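1. Introduction
Linux hibernation provides a mechanism similar to hibernation on Windows: the user can save the system's current memory contents to disk, specifically to the swap partition. When the computer is started again, the system reloads the saved memory contents, including process data and register values, and returns to the state it was in before power-off. Because files do not have to be reloaded and applications do not have to be reopened, resuming from hibernation is much faster than a normal boot.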
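2. How Linux hibernation works
Implementing hibernation first requires an understanding of Linux memory management. Standard Linux paging uses a three-level page table structure: page directory, page middle directory, and page table. The i386 uses a two-level structure, page directory and page table, and does not support a middle directory. The 4 GB linear address space has a single page directory with at most 1024 directory entries; each directory entry in turn covers 1024 page entries, and each page is 4 KB.
Paging manages memory by relocating pages of the linear address space into the physical address space. Because each page is mapped as a whole 4 KB unit and is aligned on a 4 KB boundary, the low 12 bits of a linear address pass through the paging mechanism unchanged and are used directly as the low 12 bits of the physical address.
[Figure: translation of a linear address into a physical address on x86]
Hibernation can be divided into two phases, SUSPEND and RESUME, where RESUME is the inverse of SUSPEND. The SUSPEND phase saves process data to disk and powers the machine off; the RESUME phase reads the saved process data back from disk and restores the state that existed before power-off.
The most important problems hibernation has to solve are how to save the memory contents and how to restore them. The data in memory pages is easy to obtain, and the main task of SUSPEND is to save the pages that need saving. The page table that records the page addresses has to be saved as well. Since that page table is only a linked list used for intermediate translation, it can be built temporarily during SUSPEND and the page addresses recorded in it. During RESUME the saved pages and page table are written back into memory pages; once that is done, only the page directory data has to be updated to complete the restoration.
[Figure: overview of the hibernation mechanism]
SUSPEND consists of three main steps:
1. Freeze the active processes. There are three main sources of activity: user space processes and kernel threads, device drivers, and active timers.
2. Prepare to save the data. Count the memory pages that need saving, allocate memory to hold the process data, and copy the process data into the allocated memory.
3. Save the data to disk. Write the pages that need saving to disk.
RESUME is the inverse of SUSPEND: it allocates memory for the process data to be read from disk, reads the data from disk, remaps the page table addresses, updates the segment descriptor tables, and so on.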
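The i386 two-level split described in this section can be made concrete with a little bit arithmetic. The following is a minimal userspace sketch, not kernel code: the names `split_linear`, `to_physical`, and the constants are chosen here for illustration only, and the sketch assumes plain 4 KB pages with 1024-entry directories and tables as stated above.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants for i386 two-level paging with 4 KB pages. */
#define PAGE_SHIFT_BITS   12u                      /* low 12 bits: byte offset */
#define PGDIR_SHIFT_BITS  22u                      /* top 10 bits: directory index */
#define ENTRIES_PER_TABLE 1024u
#define PAGE_BYTES        (1u << PAGE_SHIFT_BITS)

/* The three fields of a 32-bit linear address. */
struct linear_parts {
    uint32_t dir_index;    /* bits 31..22, selects a page directory entry */
    uint32_t table_index;  /* bits 21..12, selects a page entry */
    uint32_t offset;       /* bits 11..0, copied unchanged into the physical address */
};

/* Split a linear address into directory index, table index and offset. */
struct linear_parts split_linear(uint32_t linear)
{
    struct linear_parts p;

    p.dir_index = linear >> PGDIR_SHIFT_BITS;
    p.table_index = (linear >> PAGE_SHIFT_BITS) & (ENTRIES_PER_TABLE - 1u);
    p.offset = linear & (PAGE_BYTES - 1u);
    return p;
}

/* Combine a 4 KB aligned page frame base with the offset of a linear address. */
uint32_t to_physical(uint32_t frame_base, uint32_t linear)
{
    assert((frame_base & (PAGE_BYTES - 1u)) == 0);  /* frames are 4 KB aligned */
    return frame_base | (linear & (PAGE_BYTES - 1u));
}
```

For example, the linear address 0xC0101234 splits into directory index 768, table index 257 and offset 0x234; if the selected page entry pointed at frame 0x00345000, the physical address would be 0x00345234.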
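3. Implementing software hibernation on Linux
Hibernation is implemented as a module, so users can choose whether to load it according to their needs. However, because RESUME has to restore the pre-shutdown memory contents, the CPU state and so on, the module should be loaded automatically by the init of the ramdisk, before the root filesystem is mounted.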
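3.1 The SUSPEND phase
3.1.1 Freezing active processes
A process changes state as it runs, depending on circumstances. The main process states in Linux are:
TASK_RUNNING: runnable
TASK_INTERRUPTIBLE: interruptible wait
TASK_UNINTERRUPTIBLE: uninterruptible wait
TASK_ZOMBIE: zombie
TASK_STOPPED: stopped
TASK_SWAPPING: being swapped in or out
A running operating system usually has a dozen or even several dozen processes. The SUSPEND process itself, that is, the process that currently holds the execution resources (current), must not be frozen or stopped, otherwise the later steps could not run to completion. Processes flagged PF_NOFREEZE or PF_FROZEN, and processes in state TASK_ZOMBIE, TASK_DEAD or TASK_STOPPED, either cannot be frozen or do not need to be. All other processes have to be frozen, which means setting their PF_FREEZE flag. Once a process is flagged PF_FREEZE it can no longer obtain resources and therefore comes to rest.
3.1.2 Preparing to save the data
All memory pages are examined; for each page that is not marked PG_reserved, the count of pages to save is incremented. When the scan completes, the total number of pages to save is nr_copy_pages.
	for (pfn = 0; pfn < max_pfn; pfn++) {
		page = pfn_to_page(pfn);
		if (!PageReserved(page)) {
			….
			nr_copy_pages++;
			….
		}
		…
From nr_copy_pages, a corresponding number of free pages is obtained to hold the page directory; at the same time nr_copy_pages free pages are allocated, and their addresses are recorded and managed by the page directory. Besides the process data, the current register contents, including the descriptor tables, segment registers, control registers and general-purpose registers, are saved as global variables. The pages to be saved are then copied into the newly allocated free pages.
	for (pfn = 0; pfn < max_pfn; pfn++) {
		….
		if (pagedir_p) {
			pagedir_p->orig_address = ADDRESS(pfn);
			copy_page((void *) pagedir_p->address, (void *) pagedir_p->orig_address);
			pagedir_p++;
		}
		….
	}
3.1.3 Saving the data to the swap partition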
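Before anything can be written to swap, the preparation step of 3.1.2 must have produced a page directory whose entries pair each original page with its copy. The following is a minimal userspace model of that count-then-copy step, written under loud assumptions: `struct model_page`, `struct model_pbe`, `count_copy_pages`, `snapshot_pages` and `free_snapshot` are hypothetical names, memory is an ordinary array, and `malloc` stands in for free-page allocation. It is not the module's actual code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-in for a page frame together with its reserved flag. */
struct model_page {
    int reserved;
    unsigned char data[MODEL_PAGE_SIZE];
};

/* One page directory entry: where the page lives and where its copy lives. */
struct model_pbe {
    unsigned char *orig_address;
    unsigned char *address;
};

/* First pass: count the pages that are not reserved. */
size_t count_copy_pages(const struct model_page *mem, size_t max_pfn)
{
    size_t nr = 0;

    for (size_t pfn = 0; pfn < max_pfn; pfn++)
        if (!mem[pfn].reserved)
            nr++;
    return nr;
}

/* Release the first nr copies and the directory itself. */
void free_snapshot(struct model_pbe *dir, size_t nr)
{
    if (!dir)
        return;
    for (size_t i = 0; i < nr; i++)
        free(dir[i].address);
    free(dir);
}

/* Second pass: allocate one copy per non-reserved page and fill it in. */
struct model_pbe *snapshot_pages(struct model_page *mem, size_t max_pfn,
                                 size_t *nr_out)
{
    size_t nr = count_copy_pages(mem, max_pfn);
    struct model_pbe *dir = calloc(nr ? nr : 1, sizeof(*dir));
    size_t i = 0;

    if (!dir)
        return NULL;
    for (size_t pfn = 0; pfn < max_pfn; pfn++) {
        if (mem[pfn].reserved)
            continue;
        dir[i].address = malloc(MODEL_PAGE_SIZE);
        if (!dir[i].address) {
            free_snapshot(dir, i);   /* undo the copies made so far */
            return NULL;
        }
        dir[i].orig_address = mem[pfn].data;
        memcpy(dir[i].address, dir[i].orig_address, MODEL_PAGE_SIZE);
        i++;
    }
    assert(i == nr);                 /* both passes must agree */
    *nr_out = nr;
    return dir;
}
```

Counting first and copying second mirrors the two loops shown in 3.1.2: the count fixes the size of the directory before any copy is made, so the directory never has to grow while pages are being duplicated.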
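Abstract: Hibernation saves the current process data and CPU state to disk; after the system has been powered off and restarted, the saved data is read back automatically and the system returns to its original state, which greatly reduces startup time. Implementing hibernation mainly involves memory management, process management and swap handling, so studying it helps in understanding the Linux operating system in depth.
Keywords: Linux; kernel; hibernation; swap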
Freezing of tasks
(C) 2007 Rafael J. Wysocki <>, GPL
I. What is the freezing of tasks?
The freezing of tasks is a mechanism by which user space processes and some kernel threads are controlled during hibernation or system-wide suspend (on some architectures).
II. How does it work?
There are four per-task flags used for that, PF_NOFREEZE, PF_FROZEN, TIF_FREEZE and PF_FREEZER_SKIP (the last one is auxiliary). The tasks that have PF_NOFREEZE unset (all user space processes and some kernel threads) are regarded as 'freezable' and treated in a special way before the system enters a suspend state as well as before a hibernation image is created (in what follows we only consider hibernation, but the description also applies to suspend).
Namely, as the first step of the hibernation procedure the function freeze_processes() (defined in kernel/power/process.c) is called. It executes try_to_freeze_tasks(), which sets TIF_FREEZE for all of the freezable tasks and either wakes them up, if they are kernel threads, or sends fake signals to them, if they are user space processes. A task that has TIF_FREEZE set should react to it by calling the function refrigerator() (defined in kernel/power/process.c), which sets the task's PF_FROZEN flag, changes its state to TASK_UNINTERRUPTIBLE and makes it loop until PF_FROZEN is cleared for it. Then, we say that the task is 'frozen' and therefore the set of functions handling this mechanism is referred to as 'the freezer' (these functions are defined in kernel/power/process.c and include/linux/freezer.h). User space processes are generally frozen before kernel threads.
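The flag handshake described above can be pictured with a small single-threaded userspace model. This is only a sketch under stated assumptions: `struct model_task`, the `MODEL_*` flag bits, `model_request_freeze` and `model_try_to_freeze` are hypothetical names that merely mirror the roles of PF_NOFREEZE, TIF_FREEZE, PF_FROZEN, try_to_freeze_tasks() and try_to_freeze(). A real frozen task sleeps inside refrigerator() until it is thawed; the model just records the state change.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits mirroring the roles of the per-task flags. */
enum {
    MODEL_NOFREEZE   = 1u << 0,  /* task must never be frozen */
    MODEL_FREEZE_REQ = 1u << 1,  /* the freezer has asked the task to freeze */
    MODEL_FROZEN     = 1u << 2   /* the task has parked itself */
};

struct model_task {
    const char *name;
    unsigned int flags;
};

/*
 * Freezer side: request freezing of every freezable task that is not
 * already frozen. Returns the number of requests made.
 */
size_t model_request_freeze(struct model_task *tasks, size_t n)
{
    size_t requested = 0;

    for (size_t i = 0; i < n; i++) {
        if (tasks[i].flags & (MODEL_NOFREEZE | MODEL_FROZEN))
            continue;
        tasks[i].flags |= MODEL_FREEZE_REQ;
        requested++;
    }
    return requested;
}

/*
 * Task side: called by the task itself at a safe point. If a freeze was
 * requested, acknowledge it by moving to the frozen state. Returns 1 if
 * the task froze on this call and 0 otherwise.
 */
int model_try_to_freeze(struct model_task *t)
{
    if (!(t->flags & MODEL_FREEZE_REQ))
        return 0;
    t->flags &= ~MODEL_FREEZE_REQ;
    t->flags |= MODEL_FROZEN;
    return 1;
}
```

The point of the split is that the freezer only sets a request bit; the transition to the frozen state always happens in the task's own context, at a place the task has chosen as safe.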
It is not recommended to call refrigerator() directly. Instead, it is recommended to use the try_to_freeze() function (defined in include/linux/freezer.h), which checks the task's TIF_FREEZE flag and makes the task enter refrigerator() if the flag is set.
For user space processes try_to_freeze() is called automatically from the signal-handling code, but the freezable kernel threads need to call it explicitly in suitable places or use the wait_event_freezable() or wait_event_freezable_timeout() macros (defined in include/linux/freezer.h) that combine interruptible sleep with checking if TIF_FREEZE is set and calling try_to_freeze(). The main loop of a freezable kernel thread may look like the following one:
	set_freezable();
	do {
		hub_events();
		wait_event_freezable(khubd_wait,
				!list_empty(&hub_event_list) ||
				kthread_should_stop());
	} while (!kthread_should_stop() || !list_empty(&hub_event_list));
(from drivers/usb/core/hub.c::hub_thread()).
If a freezable kernel thread fails to call try_to_freeze() after the freezer has set TIF_FREEZE for it, the freezing of tasks will fail and the entire hibernation operation will be cancelled. For this reason, freezable kernel threads must call try_to_freeze() somewhere or use one of the wait_event_freezable() and wait_event_freezable_timeout() macros.
After the system memory state has been restored from a hibernation image and devices have been reinitialized, the function thaw_processes() is called in order to clear the PF_FROZEN flag for each frozen task. Then, the tasks that have been frozen leave refrigerator() and continue running.
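The all-or-nothing behaviour of this section, where one uncooperative freezable thread cancels the whole operation and thawing undoes the frozen state, can be sketched as a toy simulation. Everything here is an assumption made for illustration: `struct sim_task`, its `cooperative` field, `sim_freeze_all`, `sim_thaw_all` and the round-based retry limit are hypothetical and do not correspond to kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task record for the simulation. */
struct sim_task {
    int freezable;         /* models PF_NOFREEZE being unset */
    int cooperative;       /* assumption: the task reaches its freeze check */
    int freeze_requested;  /* models the pending freeze request */
    int frozen;            /* models the frozen state */
};

/* Clear every pending request and frozen state, as thawing would. */
void sim_thaw_all(struct sim_task *t, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        t[i].freeze_requested = 0;
        t[i].frozen = 0;
    }
}

/*
 * Request freezing of all freezable tasks and poll for up to max_rounds
 * rounds. Returns 0 once every freezable task is frozen. If some task is
 * still running after the last round, everything is thawed again and -1
 * is returned, modelling a cancelled hibernation.
 */
int sim_freeze_all(struct sim_task *t, size_t n, int max_rounds)
{
    for (size_t i = 0; i < n; i++)
        if (t[i].freezable)
            t[i].freeze_requested = 1;

    for (int round = 0; round < max_rounds; round++) {
        size_t todo = 0;

        for (size_t i = 0; i < n; i++) {
            if (!t[i].freezable || t[i].frozen)
                continue;
            if (t[i].cooperative) {
                /* the task noticed the request and parked itself */
                t[i].freeze_requested = 0;
                t[i].frozen = 1;
            } else {
                todo++;
            }
        }
        if (todo == 0)
            return 0;
    }

    sim_thaw_all(t, n);  /* roll back: no task stays frozen on failure */
    return -1;
}
```

In this model a set of cooperative tasks freezes on the first round, while a single freezable task with `cooperative` set to 0 makes `sim_freeze_all` return -1 with every task thawed again.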
III. Which kernel threads are freezable?
Kernel threads are not freezable by default. However, a kernel thread may clear PF_NOFREEZE for itself by calling set_freezable() (the resetting of PF_NOFREEZE directly is strongly discouraged). From this point it is regarded as freezable and must call try_to_freeze() in a suitable place.
IV. Why do we do that?
Generally speaking, there are several reasons to use the freezing of tasks:
1. The principal reason is to prevent filesystems from being damaged after hibernation. At the moment we have no simple means of checkpointing filesystems, so if there are any modifications made to filesystem data and/or metadata on disks, we cannot bring them back to the state from before the modifications. At the same time each hibernation image contains some filesystem-related information that must be consistent with the state of the on-disk data and metadata after the system memory state has been restored from the image (otherwise the filesystems will be damaged in a nasty way, usually making them almost impossible to repair). We therefore freeze tasks that might cause the on-disk filesystems' data and metadata to be modified after the hibernation image has been created and before the system is finally powered off. The majority of these are user space processes, but if any of the kernel threads may cause something like this to happen, they have to be freezable.
2. Next, to create the hibernation image we need to free a sufficient amount of memory (approximately 50% of available RAM) and we need to do that before devices are deactivated, because we generally need them for swapping out. Then, after the memory for the image has been freed, we don't want tasks to allocate additional memory and we prevent them from doing that by freezing them earlier. [Of course, this also means that device drivers should not allocate substantial amounts of memory from their .suspend() callbacks before hibernation, but this is a separate issue.]
3. The third reason is to prevent user space processes and some kernel threads from interfering with the suspending and resuming of devices. A user space process running on a second CPU while we are suspending devices may, for example, be troublesome and without the freezing of tasks we would need some safeguards against race conditions that might occur in such a case.
Although Linus Torvalds doesn't like the freezing of tasks, he said this in one of the discussions on LKML ():
"RJW:> Why we freeze tasks at all or why we freeze kernel threads?
Linus: In many ways, 'at all'.
I _do_ realize the IO request queue issues, and that we cannot actually do s2ram with some devices in the middle of a DMA. So we want to be able to avoid *that*, there's no question about that. And I suspect that stopping user threads and then waiting for a sync is practically one of the easier ways to do so.
So in practice, the 'at all' may become a 'why freeze kernel threads?' and freezing user threads I don't find really objectionable."
Still, there are kernel threads that may want to be freezable. For example, if a kernel thread that belongs to a device driver accesses the device directly, it in principle needs to know when the device is suspended, so that it doesn't try to access it at that time. However, if the kernel thread is freezable, it will be frozen before the driver's .suspend() callback is executed and it will be thawed after the driver's .resume() callback has run, so it won't be accessing the device while it's suspended.
4. Another reason for freezing tasks is to prevent user space processes from realizing that a hibernation (or suspend) operation is taking place. Ideally, user space processes should not notice that such a system-wide operation has occurred and should continue running without any problems after the restore (or resume from suspend). Unfortunately, in the most general case this is quite difficult to achieve without the freezing of tasks. Consider, for example, a process that depends on all CPUs being online while it's running. Since we need to disable nonboot CPUs during the hibernation, if this process is not frozen, it may notice that the number of CPUs has changed and may start to work incorrectly because of that.
V. Are there any problems related to the freezing of tasks?
Yes, there are.
First of all, the freezing of kernel threads may be tricky if they depend one on another. For example, if kernel thread A waits for a completion (in the TASK_UNINTERRUPTIBLE state) that needs to be done by freezable kernel thread B and B is frozen in the meantime, then A will be blocked until B is thawed, which may be undesirable. That's why kernel threads are not freezable by default.
Second, there are the following two problems related to the freezing of user space processes:
1. Putting processes into an uninterruptible sleep distorts the load average.
2. Now that we have FUSE, plus the framework for doing device drivers in userspace, it gets even more complicated because some userspace processes are now doing the sorts of things that kernel threads do ().
Problem 1 seems to be fixable, although it hasn't been fixed so far. The other one is more serious, but it seems that we can work around it by using hibernation (and suspend) notifiers (in that case, though, we won't be able to avoid the realization by the user space processes that the hibernation is taking place).
There are also problems that the freezing of tasks tends to expose, although they are not directly related to it. For example, if request_firmware() is called from a device driver's .resume() routine, it will time out and eventually fail, because the user land process that should respond to the request is frozen at this point. So, seemingly, the failure is due to the freezing of tasks. Suppose, however, that the firmware file is located on a filesystem accessible only through another device that hasn't been resumed yet. In that case, request_firmware() will fail regardless of whether or not the freezing of tasks is used. Consequently, the problem is not really related to the freezing of tasks, since it generally exists anyway.
A driver must have all the firmware it may need in RAM before suspend() is called. If keeping it there is not practical, for example due to its size, it must be requested early enough using the suspend notifier API described in notifiers.txt.