Introduction to Android's underlying memory reclaim mechanisms

The lower layer of Android is built on the Linux kernel. As Android versions have evolved, the memory reclaim mechanism has kept changing. This article briefly introduces the principles of memory reclaim in different versions.

Linux OOM mechanism

OOM (out of memory) handling is a memory management mechanism in Linux. When available memory runs low, the kernel kills some processes to free memory so that the system can keep running. The OOM killer is usually triggered as follows: process A tries to allocate physical memory -> a page fault occurs -> the kernel tries to allocate physical memory -> physical memory is insufficient -> OOM. When an OOM occurs, the kernel has two choices:

Kernel panic (the system halts)
Start oom_killer: traverse all current processes, score each one by its memory usage, and kill the process with the highest score to reclaim memory

Main processing flow

Before calling the OOM killer, the kernel fills in a struct oom_control:

void pagefault_out_of_memory(void)
{
 struct oom_control oc = {
 .zonelist = NULL,
 .nodemask = NULL,
 .memcg = NULL,
 .gfp_mask = 0,
 .order = 0,
 };
...
 out_of_memory(&oc);
}

The processing of oom_killer is mainly concentrated in mm/oom_kill.c

The core function is out_of_memory

bool out_of_memory(struct oom_control *oc)
{
    unsigned long freed = 0;
    enum oom_constraint constraint = CONSTRAINT_NONE;

    if (oom_killer_disabled)
        return false;

    if (!is_memcg_oom(oc)) {
        blocking_notifier_call_chain(&oom_notify_list, 0, &freed);
        if (freed > 0)
            /* Got some memory back in the last second. */
            return true;
    }

    /*
     * If current has a pending SIGKILL or is exiting, then automatically
     * select it. The goal is to allow it to allocate so that it may
     * quickly exit and free its memory.
     */
    if (task_will_free_mem(current)) {
        mark_oom_victim(current);
        wake_oom_reaper(current);
        return true;
    }

    /*
     * The OOM killer does not compensate for IO-less reclaim.
     * pagefault_out_of_memory lost its gfp context so we have to
     * make sure exclude 0 mask - all other users should have at least
     * ___GFP_DIRECT_RECLAIM to get here.
     */
    if (oc->gfp_mask && !(oc->gfp_mask & __GFP_FS))
        return true;

    /*
     * Check if there were limitations on the allocation (only relevant for
     * NUMA and memcg) that may require different handling.
     */
    constraint = constrained_alloc(oc);
    if (constraint != CONSTRAINT_MEMORY_POLICY)
        oc->nodemask = NULL;
    check_panic_on_oom(oc, constraint);

    if (!is_memcg_oom(oc) && sysctl_oom_kill_allocating_task &&
        current->mm && !oom_unkillable_task(current, NULL, oc->nodemask) &&
        current->signal->oom_score_adj != OOM_SCORE_ADJ_MIN) {
        get_task_struct(current);
        oc->chosen = current;
        oom_kill_process(oc, "Out of memory (oom_kill_allocating_task)");
        return true;
    }

    select_bad_process(oc);
    /* Found nothing?!?! Either we hang forever, or we panic. */
    if (!oc->chosen && !is_sysrq_oom(oc) && !is_memcg_oom(oc)) {
        dump_header(oc, NULL);
        panic("Out of memory and no killable processes...\n");
    }
    if (oc->chosen && oc->chosen != (void *)-1UL) {
        oom_kill_process(oc, !is_memcg_oom(oc) ? "Out of memory" :
                         "Memory cgroup out of memory");
        /*
         * Give the killed process a good chance to exit before trying
         * to allocate memory again.
         */
        schedule_timeout_killable(1);
    }
    return !!oc->chosen;
}

Processing flow:

Notify the modules registered on oom_notify_list to release memory. If these modules free some memory, the OOM killer flow ends here; if nothing is reclaimed, continue to the next step.
The OOM killer is usually triggered by a memory allocation in the current process. If the current process already has a pending SIGKILL signal or is exiting, select the current process directly; otherwise continue to the next step.
check_panic_on_oom checks the system administrator's setting to decide whether an OOM should panic the kernel directly or start the OOM killer.
If the administrator has configured the policy "whoever triggers the OOM gets killed" (sysctl_oom_kill_allocating_task), kill the process that is trying to allocate memory.
Call select_bad_process to select a suitable process, then call oom_kill_process to kill it. If no process can be found, a panic is triggered.

sysctl_panic_on_oom

This parameter is referenced in the check_panic_on_oom function. When it is 0, the OOM killer is started. When it is 2, the kernel is forced to panic unless the OOM was triggered via sysrq. When it is 1, the outcome depends on the situation: the kernel panics only for an unconstrained OOM (CONSTRAINT_NONE), and starts the OOM killer when the OOM is caused by a cpuset, memory policy, or memcg constraint.
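The three cases of this parameter can be sketched as a small user-space function. This is only an illustration of the decision described above, not kernel code: the helper name would_panic_on_oom is hypothetical, and the enum simply mirrors the kernel's enum oom_constraint.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the kernel's enum oom_constraint. */
enum oom_constraint {
    CONSTRAINT_NONE,
    CONSTRAINT_CPUSET,
    CONSTRAINT_MEMORY_POLICY,
    CONSTRAINT_MEMCG,
};

/*
 * Hypothetical helper: returns true when the kernel would panic instead of
 * starting the OOM killer, for a given panic_on_oom sysctl value.
 */
bool would_panic_on_oom(int panic_on_oom, enum oom_constraint constraint,
                        bool triggered_by_sysrq)
{
    assert(panic_on_oom >= 0 && panic_on_oom <= 2);

    if (panic_on_oom == 0)
        return false;   /* 0: always use the OOM killer */
    if (panic_on_oom != 2 && constraint != CONSTRAINT_NONE)
        return false;   /* 1: constrained OOMs only trigger a kill */
    if (triggered_by_sysrq)
        return false;   /* an OOM requested through sysrq never panics */
    return true;
}
```

For example, would_panic_on_oom(1, CONSTRAINT_MEMCG, false) is false, because a memcg hitting its limit should not take down the whole system.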

In the kernel code, enum oom_constraint further describes the OOM state. It is defined as follows:

enum oom_constraint {
 CONSTRAINT_NONE,
 CONSTRAINT_CPUSET,
 CONSTRAINT_MEMORY_POLICY,
 CONSTRAINT_MEMCG,
};
For UMA, oom_constraint is always CONSTRAINT_NONE, which means the OOM was not caused by any allocation constraint.
Under NUMA, additional constraints can cause an OOM even though the system as a whole still has sufficient memory. These constraints include:

CONSTRAINT_CPUSET
cpusets is a kernel mechanism that assigns a set of CPUs and memory nodes to a specific group of processes. If an OOM occurs in this case, it only means that the nodes from which the process is allowed to allocate memory are exhausted. The system has many memory nodes, and other nodes may still have sufficient memory.

CONSTRAINT_MEMORY_POLICY
The memory policy is the module that controls how memory is allocated from each memory node in a NUMA system. User-space programs (NUMA-aware programs) can use the memory policy API to set policies for the entire system, for a specific process, or for a specific VMA of a specific process. An OOM may also be caused by these additional memory policy constraints, in which case panicking the entire system would be inappropriate.

CONSTRAINT_MEMCG
MEMCG is the memory control group. The memory subsystem of cgroups is the controller that governs how memory resources are allocated. Put simply, it limits the memory usage of a group of processes to a given range. When the group's memory usage exceeds the upper limit, an OOM occurs; this is an OOM of type CONSTRAINT_MEMCG.

select_bad_process
This function selects a suitable process to kill. Key system processes cannot be killed; all other processes are scored by oom_badness, and the one with the highest score is selected:
Call chain: select_bad_process -> oom_evaluate_task -> oom_badness
 oom_badness
 /**
 * oom_badness - heuristic function to determine which candidate task to kill
 * @p: task struct of which task we should calculate
 * @totalpages: total present RAM allowed for page allocation
 *
 * The heuristic for determining which task to kill is made to be as simple and
 * predictable as possible. The goal is to return the highest value for the
 * task consuming the most memory to avoid subsequent oom failures.
 */
unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
 const nodemask_t *nodemask, unsigned long totalpages)
{
 long points;
 long adj;
...
 adj = (long)p->signal->oom_score_adj;
 if (adj == OOM_SCORE_ADJ_MIN ||
 test_bit(MMF_OOM_SKIP, &p->mm->flags) ||
 in_vfork(p)) {
 task_unlock(p);
 return 0;
 }
 /*
 * The baseline for the badness score is the proportion of RAM that each
 * task's rss, pagetable and swap space use.
 */
 points = get_mm_rss(p->mm) + get_mm_counter(p->mm, MM_SWAPENTS) +
 atomic_long_read(&p->mm->nr_ptes) + mm_nr_pmds(p->mm);
 task_unlock(p);

 /*
 * Root processes get 3% bonus, just like the __vm_enough_memory()
 * implementation used by LSMs.
 */
 if (has_capability_noaudit(p, CAP_SYS_ADMIN))
 points -= (points * 3) / 100;

 /* Normalize to oom_score_adj units */
 adj *= totalpages / 1000;
 points += adj;

 /*
 * Never return 0 for an eligible task regardless of the root bonus and
 * oom_score_adj (oom_score_adj can't be OOM_SCORE_ADJ_MIN here).
 */
 return points > 0 ? points : 1;
}
The main work of oom_badness in detail:

The oom_score_adj check
The score of a task (oom_score) is mainly composed of two parts:

The system score, based mainly on the memory usage of the task
The user score, that is, oom_score_adj

The actual score of a task combines both. If the user sets the task's oom_score_adj to OOM_SCORE_ADJ_MIN (-1000), the OOM killer is in effect forbidden from killing that process. In that case the function returns 0, which tells the OOM killer that this is a "good process". As the end of the function shows, the lowest score an eligible task can receive is 1.

The baseline score
The system score is based on physical memory consumption, in three parts: RSS, the space occupied in the swap file or swap device, and the memory occupied by page tables.

The root bonus
Root processes get a 3% bonus, so 3% of the memory usage is subtracted here.

Normalizing to oom_score_adj units
Users can adjust the oom_score through oom_score_adj.

The value range of oom_score_adj is -1000 to 1000. A value of 0 means the user does not adjust the oom_score; a negative value means a discount is subtracted from the actual score; a positive value penalizes the task, that is, increases its oom_score. In practice, the adjustment is calculated relative to the memory that can be allocated for this allocation (if there is no allocation constraint, this is all available memory in the system; if the system uses cpusets, it is the actual quota of the cpuset). The totalpages parameter passed to oom_badness is that upper limit of allocatable memory. The actual points are adjusted according to oom_score_adj. For example, if oom_score_adj is set to -500, the score is discounted by 50% of totalpages, meaning that half of the allocatable memory limit is subtracted from the memory the task actually uses.
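The arithmetic above can be tried out with a simplified user-space model of oom_badness. This is a sketch only: the function name badness_points and its flattened parameter list are assumptions made for illustration, and all sizes are in pages.

```c
#include <assert.h>
#include <stdbool.h>

#define OOM_SCORE_ADJ_MIN (-1000)

/*
 * Simplified model of oom_badness(): rss, swap and pagetables are the
 * task's usage in pages, totalpages is the allocatable upper limit.
 */
unsigned long badness_points(long rss, long swap, long pagetables,
                             long oom_score_adj, long totalpages,
                             bool is_root)
{
    assert(totalpages > 0);

    if (oom_score_adj == OOM_SCORE_ADJ_MIN)
        return 0;                       /* exempt from the OOM killer */

    long points = rss + swap + pagetables;

    if (is_root)
        points -= (points * 3) / 100;   /* 3% bonus for root processes */

    /* Normalize oom_score_adj (-1000..1000) to pages of totalpages. */
    points += oom_score_adj * (totalpages / 1000);

    /* An eligible task never scores 0. */
    return points > 0 ? (unsigned long)points : 1;
}
```

With totalpages = 1000000, a task using 300000 pages scores 300000 at oom_score_adj 0, 800000 at oom_score_adj 500, and the minimum of 1 at oom_score_adj -500, because half of the allocatable limit is subtracted.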


Android kernel LMK mechanism
In Android, even after the user exits an application, the application still remains in the system so that it can be restarted quickly. However, as the number of opened programs grows, the system runs short of memory, and some processes need to be killed to free memory. Whether a process needs to be killed, and which one, is decided by Android's internal LowMemoryKiller mechanism.
Android's Low Memory Killer is a memory management mechanism derived from the OOM killer of the standard Linux kernel. When system memory is insufficient, it kills unnecessary processes to release their memory. Unnecessary processes are chosen by two criteria: oom_adj and the amount of memory occupied. oom_adj represents the priority of the process: the higher the value, the lower the priority and the more easily the process is killed. Each oom_adj level can have its own free-memory threshold. The Android kernel checks at intervals whether the current free memory is below a threshold. If it is, it kills the unnecessary process with the largest oom_adj, and repeats until free memory is back above the threshold.
Comparison with OOM

Name | Trigger condition
OOM | A page fault occurs when a process requests memory, and there is not enough remaining memory to allocate
Lowmemorykiller | The system memory pressure is scanned periodically; reclaim starts when free memory falls below a given threshold

LMK initialization
Initialization mainly registers a kobject and the various notifier chains:
static int __init lowmem_init(void)
{
 rc = kobject_init_and_add(lowmem_notify_kobj, &lowmem_notify_kobj_type,
 mm_kobj, "lowmemkiller");
 register_shrinker(&lowmem_shrinker);
...
#ifdef CONFIG_E_SHOW_MEM
 register_e_show_mem_notifier(&tasks_e_show_mem_notifier);
#endif
 vmpressure_notifier_register(&lmk_vmpr_nb);
 nl_sk = netlink_kernel_create(&init_net, LMK_NETLINK_PROTO, &cfg);
}
The definition of lowmem_shrinker is as follows:
static struct shrinker lowmem_shrinker = {
 .scan_objects = lowmem_scan,
 .count_objects = lowmem_count,
 .seeks = DEFAULT_SEEKS * 16,
 .flags = SHRINKER_LMK
};
static short lowmem_adj[6] = {
 0,
 1,
 6,
 12,
};
static int lowmem_minfree[6] = {
 3 * 512, /* 6MB */
 2 * 1024, /* 8MB */
 4 * 1024, /* 16MB */
 16 * 1024, /* 64MB */
};
The lowmem_adj and lowmem_minfree arrays are filled in while the system is running. The current values can be read from the following nodes:
/sys/module/lowmemorykiller/parameters/minfree: a comma-separated set of numbers, each representing a free-memory level (in pages)
/sys/module/lowmemorykiller/parameters/adj: corresponds one-to-one with the numbers above; each number represents a process priority level

sp9832e_1h10:/sys/module/lowmemorykiller/parameters # cat minfree; cat adj
18432,23040,27648,32256,55296,80640
0,100,200,300,900,906

Meaning: the two sets of numbers correspond to each other one-to-one. The minfree values are in pages, consistent with the comments in lowmem_minfree above (3 * 512 pages is 6 MB with 4 KB pages). When free memory falls below 80640 pages, processes with priority 906 and above are killed; when it falls below 55296 pages, processes with priority 900 and above are killed.
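The threshold lookup can be sketched in user space as follows. This is a minimal illustration using the sample values read from the device above; the function name pick_min_score_adj and the NO_KILL sentinel are hypothetical names, with NO_KILL chosen as one above the maximum oom_score_adj of 1000.

```c
#include <assert.h>
#include <stddef.h>

/* Sample values from the sp9832e device shown above; minfree is in pages. */
static const int   lowmem_minfree[] = { 18432, 23040, 27648, 32256, 55296, 80640 };
static const short lowmem_adj[]     = { 0, 100, 200, 300, 900, 906 };

#define NO_KILL 1001   /* hypothetical sentinel: no process is eligible */

/*
 * Returns the minimum oom_score_adj a process must have to be killed,
 * given the free pages and file-cache pages currently available.
 */
short pick_min_score_adj(int other_free, int other_file)
{
    size_t n = sizeof(lowmem_adj) / sizeof(lowmem_adj[0]);

    assert(other_free >= 0 && other_file >= 0);

    for (size_t i = 0; i < n; i++) {
        /* Both free memory and file cache must be below the level. */
        if (other_free < lowmem_minfree[i] && other_file < lowmem_minfree[i])
            return lowmem_adj[i];
    }
    return NO_KILL;
}
```

The lower the free memory, the earlier a level matches and the smaller the returned value, so more important processes become eligible as pressure increases.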
 The memory recovery function is mainly implemented in the lowmem_scan function:
 static unsigned long lowmem_scan(struct shrinker *s, struct shrink_control *sc)
{
 /* work around for antutu */
 struct task_struct *selected_antutu = NULL;
 int selected_antutu_tasksize = 0;
 short selected_antutu_adj = -1000;
 bool has_antutu_3D = false;
#ifdef CONFIG_LOWMEM_NOTIFY_KOBJ
 lowmem_notif_sc.gfp_mask = sc->gfp_mask;
 if (get_free_ram(&other_free, &other_file_orig, &other_file, sc)) {
 if (mutex_isn_locked(mutex) msleep(1);

 if (!mutex_is_locked(&kernfs_mutex))
 lowmem_notify_killzone_approach();
 else
 lowmem_print(1, "skip as kernfs_mutex is locked .");
 }
#else
 get_current_ram(&other_free, &other_file_orig, &other_file, sc);
#endif

 for (i = 0; i < array_size; i++) {
 minfree = lowmem_minfree[i];
 if (other_free < minfree && other_file < minfree) {
 min_score_adj = lowmem_adj[i];
 break;
 }
 }

 ret = adjust_minadj(&min_score_adj, &pressure);

 selected_oom_score_adj = min_score_adj; 

 for_each_process(tsk) {
 struct task_struct *p;
 short oom_score_adj;

 if (tsk->flags & PF_KTHREAD)
 continue;

 if (time_before_eq(jiffies, lowmem_deathpending_timeout)) {
 if (test_task_flag(tsk, TIF_MEMDIE)) {
 rcu_read_unlock();
 mutex_unlock(&scan_mutex) ;
 return 0;
 }
 }

 /* workaround for antutu */
 if (strstr("com.antutu.benchmark.full" , p->comm))
 has_antutu_3D = true;

 oom_score_adj = p->signal->oom_score_adj;
 if (oom_score_adj < min_score_adj) {
 task_unlock(p);
 continue;
 }
 tasksize = get_mm_rss(p->mm);
 task_unlock(p);
 if (tasksize <= 0)
 continue;
 if (selected) {
 if (oom_score_adj < selected_oom_score_adj)
 continue;
 if (oom_score_adj == selected_oom_score_adj &&
 tasksize <= selected_tasksize)
 continue;
 }
 /* workaround for antutu */
 if (! selected_antutu &&
 strstr("com.antutu.ABenchMark", p->comm)) {
 selected_antutu = p;
 selected_antutu_tasksize = tasksize;
 selected_antutu_adj = oom_score_adj;
 continue;
 }
 selected = p;
 selected_tasksize = tasksize;
 selected_oom_score_adj = oom_score_adj;
 lowmem_print(2, "select '%s' (%d), adj %hd, size %d, to kill\n",
 p->comm, p->pid, oom_score_adj, tasksize);
 }
 /* workaround for antutu:
 * if 3D task is not exist, check if the antutu task is more suited
 * to be killed
 */
 if (selected && selected_antutu && !has_antutu_3D) {
 if (selected_antutu_adj> selected_oom_score_adj ||
 (selected_antutu_adj == selected_oom_score_adj &&
 selected_antutu_tasksize > selected_tasksize)) {
 selected = selected_antutu;
 selected_tasksize = selected_antutu_tasksize;
 selected_oom_score_adj = selected_antutu_adj;
 }
 }
 if (selected) {
 long cache_size = other_file * (long)(PAGE_SIZE / 1024);
 long cache_size_orig = other_file_orig * (long)(PAGE_SIZE / 1024);
 long cache_limit = minfree * (long)(PAGE_SIZE / 1024) ;
 long free = other_free * (long)(PAGE_SIZE / 1024);

 if (test_task_flag(selected, TIF_MEMDIE) &&
 (test_task_state(selected, TASK_UNINTERRUPTIBLE))) { 
 lowmem_print(2, "'%s' (%d) is already killed\n",
 selected->comm,
 selected->pid);
 rcu_read_unlock() ;
 mutex_unlock(&scan_mutex);
 return 0;
 }

 task_lock(selected);
 /* add for lmfs */
 selected_process_uid = from_kuid(&init_user_ns,
 selected->cred->uid);
 selected_process_pid = selected->pid;
 selected_process_adj = selected_oom_score_adj;

 send_sig(SIGKILL, selected, 0);
 /*
 * FIXME: lowmemorykiller shouldn't abuse global OOM killer
 * infrastructure. There is no real reason why the selected
 * task should have access to the memory reserves.
 */
 if (selected->mm)
 mark_oom_victim(selected);
 task_unlock(selected);
 trace_lowmemory_kill(selected, cache_size, cache_limit, free);
 si_swapinfo(&si);

 lowmem_deathpending_timeout = jiffies + HZ;
 rem += selected_tasksize;
 trace_almk_shrink(selected_tasksize, ret,
 other_free, other_file, selected_oom_score_adj);
} else {
 trace_almk_shrink(1, ret, other_free, other_file, 0);
 }


 if (selected) {
 send_killing_app_info_to_user(selected_process_uid,
 selected_process_pid,
 selected_process_adj);
 return rem;
}
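Stripped of locking, tracing and the vendor-specific workarounds, the selection rule in lowmem_scan is: among processes whose oom_score_adj is at least min_score_adj, pick the one with the highest oom_score_adj, and break ties by the larger RSS. The following is a sketch of that rule only; struct task_info and select_victim are hypothetical stand-ins for the kernel's task_struct and scan loop.

```c
#include <assert.h>
#include <stddef.h>

struct task_info {          /* hypothetical stand-in for task_struct */
    int   pid;
    short oom_score_adj;
    long  rss_pages;
};

/* Returns the index of the task to kill, or -1 if none qualifies. */
int select_victim(const struct task_info *tasks, size_t n, short min_score_adj)
{
    int selected = -1;

    assert(tasks != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        const struct task_info *p = &tasks[i];

        /* Skip tasks that are too important or hold no memory. */
        if (p->oom_score_adj < min_score_adj || p->rss_pages <= 0)
            continue;

        if (selected >= 0) {
            const struct task_info *s = &tasks[selected];

            if (p->oom_score_adj < s->oom_score_adj)
                continue;
            if (p->oom_score_adj == s->oom_score_adj &&
                p->rss_pages <= s->rss_pages)
                continue;
        }
        selected = (int)i;
    }
    return selected;
}
```

Killing the largest process at the least important level frees the most memory per kill while keeping foreground processes alive as long as possible.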
Android LMKD mechanism
The following introduces the user-space lowmemorykiller daemon (lmkd) added in Android 9, and how to configure it.

Code location: platform/system/core/lmkd/

In the past, Android used the in-kernel lowmemorykiller driver to relieve memory pressure by terminating unnecessary processes. This mechanism is rigid and depends on hard-coded values. In addition, starting with kernel version 4.12, the lowmemorykiller driver is excluded from the upstream kernel.
The user-space lmkd process achieves the same function, but it uses existing kernel mechanisms to detect and estimate memory pressure. It uses vmpressure events generated by the kernel to get notified of memory pressure levels. It can also use the memory cgroup feature to limit the memory resources allocated to each process according to the importance of that process.
ProcessList defines the process priorities: the more important a process is, the lower its priority value. The priority of the foreground app is 0, and the priority of system processes is generally negative. Process management and killing are therefore generally aimed at upper-level apps. The priorities of these processes are adjusted in AMS (ActivityManagerService), which continuously recalculates the priority of each process according to the state of the components in it. After the calculation, the new value is promptly written to the file node of the corresponding process. This write is not performed by AMS itself but by lmkd; the two communicate through a socket.
lmkd is a resident process on the device. After the upper-layer ActivityManager runs updateOomAdj, it communicates with lmkd through the socket to update process priorities and, if necessary, kill processes to release memory. lmkd is started by the init process, and its lmkd.rc is defined as follows:
service lmkd /system/bin/lmkd
 class core
 group root readproc
 critical
 socket lmkd seqpacket 0660 system system
 socket lmfs stream 0660 root system
 socket vmpressure stream 0666 root system
 writepid /dev/cpuset/system-background/tasks
Configure the kernel to support LMKD
Starting from Android 9, the user-space lmkd is activated when the kernel lowmemorykiller driver is not detected. Note that the user-space lmkd requires kernel support for memory cgroups. Therefore, to use the user-space lmkd instead, compile the kernel with the following configuration settings:
CONFIG_ANDROID_LOW_MEMORY_KILLER=n
CONFIG_MEMCG=y
CONFIG_MEMCG_SWAP=y
LMKD termination strategy
lmkd supports new termination strategies based on the following:
vmpressure events
Their severity
Other hints (such as swap utilization)
Old mode (in this mode, lmkd makes termination decisions like the kernel lowmemorykiller driver).

The new termination strategy is different for devices with insufficient memory and high-performance devices. For devices with insufficient memory, under normal circumstances, the system will choose to withstand greater memory pressure; for high-performance devices, if there is memory pressure, it is an abnormal situation and should be repaired in time to avoid affecting the overall performance. The ro.config.low_ram attribute allows you to select one of these modes. For instructions on how to set this property, see Low Memory Configuration. 
In the old mode, the lmkd termination decision was made based on the available memory and file cache thresholds. You can set the ro.lmk.use_minfree_levels property to true to enable this mode. 
Configure LMKD for a specific device

Property | Usage | Default value
ro.config.low_ram | Choose between low-memory devices and high-performance devices. | false
ro.lmk.use_minfree_levels | Use available memory and file cache thresholds to decide when to terminate. This mode works the same way as the kernel lowmemorykiller driver did. | false
ro.lmk.low | The lowest oom_adj score of processes that can be terminated at the low vmpressure level. | 1001 (disabled)
ro.lmk.medium | The lowest oom_adj score of processes that can be terminated at the medium vmpressure level. | 800 (cached or non-essential services)
ro.lmk.critical | The lowest oom_adj score of processes that can be terminated at the critical vmpressure level. | 0 (any process)
ro.lmk.critical_upgrade | Allow upgrading to the critical level. | false
ro.lmk.upgrade_pressure | The upper limit of mem_pressure* at which the level is upgraded because the system is swapping too much. | 100 (disabled)
ro.lmk.downgrade_pressure | The lower limit of mem_pressure* at which a vmpressure event is ignored because enough free memory is still available. | 100 (disabled)
ro.lmk.kill_heaviest_task | Terminate the heaviest eligible task (best decision) rather than any eligible task (quick decision). | true
ro.lmk.kill_timeout_ms | The duration (in milliseconds) after a termination during which no other termination is performed. | 0 (disabled)
ro.lmk.debug | Enable lmkd debug logs. | false


*Note: mem_pressure = RAM usage / (RAM + swap usage), expressed as a percentage.
At the low level, memory is reclaimed normally; at the medium level, swapping starts; the critical level means memory is about to run out.
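How mem_pressure interacts with ro.lmk.upgrade_pressure and ro.lmk.downgrade_pressure can be sketched as below. This is a simplified model derived from the property descriptions in the table, not lmkd's actual code: the function names are hypothetical, and the exact order of checks in lmkd may differ.

```c
#include <assert.h>
#include <stdbool.h>

enum vmpressure_level {
    VMPRESS_LEVEL_LOW,
    VMPRESS_LEVEL_MEDIUM,
    VMPRESS_LEVEL_CRITICAL,
};

/* mem_pressure = RAM usage / (RAM + swap usage), as a percentage. */
int mem_pressure_percent(long long mem_usage, long long memsw_usage)
{
    assert(memsw_usage > 0 && mem_usage >= 0 && mem_usage <= memsw_usage);
    return (int)((mem_usage * 100) / memsw_usage);
}

/*
 * Returns false if the vmpressure event should be ignored. Otherwise
 * returns true and stores the (possibly upgraded) level in *level.
 */
bool adjust_level(enum vmpressure_level *level, int mem_pressure,
                  int upgrade_pressure, int downgrade_pressure,
                  bool critical_upgrade)
{
    /* A low percentage means much of the usage sits in swap: upgrade. */
    if (critical_upgrade && *level != VMPRESS_LEVEL_CRITICAL &&
        mem_pressure < upgrade_pressure)
        *level = (enum vmpressure_level)(*level + 1);

    /* Above the downgrade limit enough memory is still free: ignore. */
    if (mem_pressure > downgrade_pressure)
        return false;
    return true;
}
```

With the default value of 100 for both limits, mem_pressure can never exceed downgrade_pressure, so no event is ignored, which matches the "disabled" note in the table.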

Implementation
AMS and LMKD communication commands
There are five main commands. Each represents one way of controlling data, and they are defined in both ProcessList and lmkd:
LMK_TARGET: Update minfree and adj in /sys/module/lowmemorykiller/parameters/
LMK_PROCPRIO: Update the priority of the specified process, that is, oom_score_adj
LMK_PROCREMOVE: Remove the process
LMK_PROCPURGE: Purge all registered processes
LMK_GETKILLCNT: Get the number of killed processes
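As an illustration of what such a command looks like on the socket, the sketch below packs an LMK_PROCPRIO request. It rests on several assumptions that should be checked against the lmkd sources: that a packet is an array of 32-bit integers in network byte order with the command first, that the enum values follow the order listed above starting from 0, and that LMK_PROCPRIO carries pid, uid and oom_score_adj (three arguments, matching the nargs check in the command handler). The helper names are hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Command codes in the order listed above (numeric values assumed). */
enum lmk_cmd {
    LMK_TARGET = 0,
    LMK_PROCPRIO,
    LMK_PROCREMOVE,
    LMK_PROCPURGE,
    LMK_GETKILLCNT,
};

/*
 * Packs an LMK_PROCPRIO request into packet (at least 4 words long) and
 * returns the packet length in bytes.
 */
size_t pack_procprio(int32_t *packet, int32_t pid, int32_t uid,
                     int32_t oom_score_adj)
{
    assert(packet != NULL);
    assert(oom_score_adj >= -1000 && oom_score_adj <= 1000);

    packet[0] = (int32_t)htonl((uint32_t)LMK_PROCPRIO);
    packet[1] = (int32_t)htonl((uint32_t)pid);
    packet[2] = (int32_t)htonl((uint32_t)uid);
    packet[3] = (int32_t)htonl((uint32_t)oom_score_adj);
    return 4 * sizeof(int32_t);
}

/* Number of arguments that follow the command word. */
int packet_nargs(size_t len_bytes)
{
    return (int)(len_bytes / sizeof(int32_t)) - 1;
}
```

The receiver would convert each word back with ntohl before interpreting it, so negative oom_score_adj values survive the round trip.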
Data structure
Data structure used to describe event handlers

struct event_handler_info {
 int data;
 void (*handler)(int data, uint32_t events);
};
This structure is used to define the variable that describes vmpressure events: vmpressure_hinfo
Data structure used to describe socket events
struct sock_event_handler_info {
 int sock;
 struct event_handler_info handler_info;
};
It is used to define two variables: ctrl_sock and data_sock.
The process table is similar to a hash table, but the index is not computed by hashing: oom_score_adj is converted directly into the index. Each element of the array is a circular doubly linked list, and the priority of a process serves as its array index. The array covers priorities from -1000 to +1000, so it has 2001 slots, and processes with the same priority share the same index. All procs on one list have the same priority. This is convenient when killing processes by priority: to kill processes of a given priority, look up the list for that priority and kill its entries one by one.
static struct adjslot_list procadjslot_list[ADJTOSLOT(OOM_SCORE_ADJ_MAX) + 1];
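A minimal sketch of such a table is shown below. The ADJTOSLOT definition (shifting the range -1000..1000 to 0..2000) and the layout of struct proc are assumptions made for illustration, and the helper names slots_init, slot_insert, slot_remove and find_victim are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define OOM_SCORE_ADJ_MIN (-1000)
#define OOM_SCORE_ADJ_MAX 1000
#define ADJTOSLOT(adj) ((adj) - OOM_SCORE_ADJ_MIN)   /* assumed mapping */

struct adjslot_list {
    struct adjslot_list *next;
    struct adjslot_list *prev;
};

struct proc {
    struct adjslot_list asl;   /* must stay first: nodes are cast back to proc */
    int pid;
    int oomadj;
};

static struct adjslot_list procadjslot_list[ADJTOSLOT(OOM_SCORE_ADJ_MAX) + 1];

/* Make every slot an empty circular list. */
void slots_init(void)
{
    for (int i = 0; i <= ADJTOSLOT(OOM_SCORE_ADJ_MAX); i++)
        procadjslot_list[i].next = procadjslot_list[i].prev = &procadjslot_list[i];
}

/* Insert p at the head of the list for its priority. */
void slot_insert(struct proc *p)
{
    assert(p->oomadj >= OOM_SCORE_ADJ_MIN && p->oomadj <= OOM_SCORE_ADJ_MAX);
    struct adjslot_list *head = &procadjslot_list[ADJTOSLOT(p->oomadj)];

    p->asl.next = head->next;
    p->asl.prev = head;
    head->next->prev = &p->asl;
    head->next = &p->asl;
}

void slot_remove(struct proc *p)
{
    p->asl.prev->next = p->asl.next;
    p->asl.next->prev = p->asl.prev;
    p->asl.next = p->asl.prev = &p->asl;
}

/* Walk from the least important priority down to min_adj. */
struct proc *find_victim(int min_adj)
{
    assert(min_adj >= OOM_SCORE_ADJ_MIN);
    for (int adj = OOM_SCORE_ADJ_MAX; adj >= min_adj; adj--) {
        struct adjslot_list *head = &procadjslot_list[ADJTOSLOT(adj)];

        if (head->prev != head)
            return (struct proc *)head->prev;   /* tail: oldest entry in the slot */
    }
    return NULL;
}
```

Because the index is the priority itself, finding a victim is a linear walk over at most 2001 slots, with no hashing or sorting needed.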
Timing diagram

lmkd workflow
Entrance main
int main(int argc __unused, char **argv __unused) {
 struct sched_param param = {
 .sched_priority = 1,
 };
 /* 2019/04/22 11:15:44 by jinliang
 * Lock the memory used by this process now and in the future in physical memory to prevent it from being swapped
 */
 if (mlockall(MCL_CURRENT | MCL_FUTURE | MCL_ONFAULT) && (errno != EINVAL)) {
 ALOGW("mlockall failed %s", strerror(errno));
 }

 /* CAP_NICE required */
 /* 2019/04/22 11:18:12 by jinliang */
 //Set the scheduling strategy to FIFO
 if (sched_setscheduler(0, SCHED_FIFO, &param)) {
 ALOGW("set SCHED_FIFO failed %s", strerror(errno));
 }
 if (lmfs_enabled)
 start_lmfs();
 /* 
 * Enter the infinite loop and wait for the fd event
 */
 mainloop();
 }
init
/*1. Initialize the socket monitoring interface ctrl_sock
 *2. Create an epoll instance epollfd
 *3. Fill in epoll_event epev
 * 1) The event monitored by epoll is EPOLLIN
 * 2) The monitored fd is ctrl_sock.sock
 * 3) The callback function is ctrl_sock.handler_info
 *4. Monitoring event registration*/
static int init(void) {
 struct epoll_event epev;

/* 2019/04/15 16:39:00 by jinliang */
/*---get _SC_PAGESIZE value---
 * PAGE_SIZE is 4096,so total about 16M?
 */
 page_k = sysconf(_SC_PAGESIZE);
 if (page_k == -1)
 page_k = PAGE_SIZE;
 page_k /= 1024;
/*Create an epoll handle*/
 epollfd = epoll_create(MAX_EPOLL_EVENTS);
/*get the lmkd control socket fd to listen the AMS command*/
 ctrl_sock.sock = android_get_control_socket("lmkd");
 /*list the socket command pass by AMS*/
 /* listen(): listen for connection requests on the socket
 * #include <sys/socket.h>
 * int listen(int sockfd, int backlog)
 * The parameter sockfd is the socket used by the listen function.
 * The parameter backlog is the length of the listening queue. */
 ret = listen(ctrl_sock.sock, MAX_DATA_CONN);

 epev.events = EPOLLIN;
 ctrl_sock.handler_info.handler = ctrl_connect_handler;
 epev.data.ptr = (void *)&(ctrl_sock.handler_info);
 /* Register listener*/
 if (epoll_ctl(epollfd, EPOLL_CTL_ADD, ctrl_sock.sock, &epev) == -1) {
 ALOGE("epoll_ctl for lmkd control socket failed (errno=%d)", errno);
 return -1;
 }
 maxevents++;
 if (use_inkernel_interface) { 
 ALOGI("Using in-kernel low memory killer interface");
} else {
 /*Set monitoring /dev/memcg/memory.pressure_level and /dev/memcg/cgroup.event_control */
 if (!init_mp_common(VMPRESS_LEVEL_LOW) ||
 !init_mp_common(VMPRESS_LEVEL_MEDIUM) ||
 !init_mp_common(VMPRESS_LEVEL_CRITICAL)) {
 ALOGE("Kernel does not support memory pressure events or in-kernel low memory killer");
 return -1;
 }
 }
 return 0;
}
Process the data passed by the socket
The main purpose of this step is to maintain minfree, adj, and the process linked lists.
The connection is accepted in the ctrl_connect_handler method, and the data is read and processed in ctrl_data_handler, which calls ctrl_command_handler:
static void ctrl_command_handler(int dsock_idx) {
 len = ctrl_data_read(dsock_idx, (char *)packet, CTRL_PACKET_MAX_SIZE);
 cmd = lmkd_pack_get_cmd(packet);
 switch(cmd) {
 case LMK_TARGET:
 targets = nargs / 2;
 if (nargs & 0x1 || targets> (int)ARRAY_SIZE(lowmem_adj))
 goto wronglen;
 cmd_target(targets, packet);
 break;
 case LMK_PROCPRIO:
 if (nargs != 3)
 goto wronglen;
 cmd_procprio(packet);
 break;
 case LMK_PROCREMOVE:
 if (nargs != 1)
 goto wronglen;
        cmd_procremove(packet);
        break;
    case LMK_PROCPURGE:
        if (nargs != 0)
            goto wronglen;
        cmd_procpurge();
        break;
    case LMK_GETKILLCNT:
        if (nargs != 2)
            goto wronglen;
        kill_cnt = cmd_getkillcnt(packet);
        len = lmkd_pack_set_getkillcnt_repl(packet, kill_cnt);
        if (ctrl_data_write(dsock_idx, (char *)packet, len) != len)
            return;
        break;
}
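When use_inkernel_interface is set, the work is very simple: lmkd only updates the file nodes. When the kernel interface is not used, lmkd has to maintain the two tables itself and update them each time adj is updated. As the initialization code shows, if the kernel lowmemorykiller is not used, lmkd must also obtain the memory state of the device itself; if that state matches one of the minfree levels, it kills some processes to release memory.
Killing processes
Processes are killed mainly by mp_event_common, the callback registered earlier in init_mp_common to listen on memory.pressure_level: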
static void mp_event_common(int data, uint32_t events __unused) {
    /*
     * when only LMK enabled, we can pass vmpressure & swap-pressure to
     * PerformanceManagerService because-of enable_adaptive_lmk,
     * PerformanceManagerService then force-stop apps from the LRU list
     * according to current vmpressure & swap-pressure;
     *
     * when only MEMCG enabled, the vmpressure & swap-pressure should also
     * be passed to PerformanceManagerService
     */
    if (!use_inkernel_interface) {
        enum vmpressure_level level = (enum vmpressure_level)data;
        int vmpressure_value = 0;
        switch (level) {
            case VMPRESS_LEVEL_LOW:
                vmpressure_value = 70;
                break;
            case VMPRESS_LEVEL_MEDIUM:
                vmpressure_value = 80;
                break;
            case VMPRESS_LEVEL_CRITICAL:
                vmpressure_value = 90;
                break;
            default:
                break;
        }
        handle_vmpressure(vmpressure_value);
    }
}
The kill is reached through a chain of calls: mp_event_common -> handle_vmpressure -> find_and_kill_process_adj -> find_and_kill_process_adj_locked -> kill_one_process

References:

https://source.android.com/devices/tech/perf/lmkd
https://www.sohu.com/a/238012686_467784
http://gityuan.com/2016/09/17/android-lowmemorykiller/
https://blog.csdn.net/u011733869/article/details/78820240
