【Kernel Scheduler, Load Balancing】【calculate_imbalance】

I could not find code identical to this kernel version's calculate_imbalance online, so the analysis here leans on the function's own comments to confirm what it does.

calculate_imbalance

/**
 * calculate_imbalance - Calculate the amount of imbalance present within the
 *			 groups of a given sched_domain during load balance.
 * @env: load balance environment
 * @sds: statistics of the sched_domain whose imbalance is to be calculated.
 */
static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *sds)
{
	unsigned long max_pull, load_above_capacity = ~0UL;
	struct sg_lb_stats *local, *busiest;

	local = &sds->local_stat;
	busiest = &sds->busiest_stat;

	if (busiest->group_type == group_imbalanced) {
		/*
		 * In the group_imb case we cannot rely on group-wide averages
		 * to ensure cpu-load equilibrium, look at wider averages. XXX
		 */
		busiest->load_per_task =
			min(busiest->load_per_task, sds->avg_load);
	}

	/*
	 * Avg load of busiest sg can be less and avg load of local sg can
	 * be greater than avg load across all sgs of sd because avg load
	 * factors in sg capacity and sgs with smaller group_type are
	 * skipped when updating the busiest sg:
	 */
	if (busiest->group_type != group_misfit_task &&
	    (busiest->avg_load <= sds->avg_load ||
	     local->avg_load >= sds->avg_load)) {
		env->imbalance = 0;	/* already balanced */
		return fix_small_imbalance(env, sds);
	}

	/*
	 * If there aren't any idle cpus, avoid creating some.
	 */
	if (busiest->group_type == group_overloaded &&
	    local->group_type   == group_overloaded) {
		load_above_capacity = busiest->sum_nr_running * SCHED_CAPACITY_SCALE;
		if (load_above_capacity > busiest->group_capacity) {
			load_above_capacity -= busiest->group_capacity;
			load_above_capacity *= scale_load_down(NICE_0_LOAD);
			load_above_capacity /= busiest->group_capacity;
			/*
			 * i.e. load_above_capacity =
			 *	(load_above_capacity - busiest->group_capacity) *
			 *	scale_load_down(NICE_0_LOAD) / busiest->group_capacity
			 */
		} else
			load_above_capacity = ~0UL;
	}

	/*
	 * We're trying to get all the cpus to the average_load, so we don't
	 * want to push ourselves above the average load, nor do we wish to
	 * reduce the max loaded cpu below the average load. At the same time,
	 * we also don't want to reduce the group load below the group
	 * capacity. Thus we look for the minimum possible imbalance.
	 */
	/* load_above_capacity must be an avg_load-like quantity */
	max_pull = min(busiest->avg_load - sds->avg_load, load_above_capacity);

	/* How much load to actually move to equalise the imbalance */
	/* max_pull must likewise be an avg_load-like quantity */
	env->imbalance = min(max_pull * busiest->group_capacity,
			     (sds->avg_load - local->avg_load) * local->group_capacity)
			 / SCHED_CAPACITY_SCALE;

	/*
	 * Boost imbalance to allow misfit task to be balanced.
	 * Always do this if we are doing a NEWLY_IDLE balance
	 * on the assumption that any tasks we have must not be
	 * long-running (and hence we cannot rely upon load).
	 * However if we are not idle, we should assume the tasks
	 * we have are longer running and not override load-based
	 * calculations above unless we are sure that the local
	 * group is underutilized.
	 */
	/* group_misfit_task_load is updated in update_sg_lb_stats() */
	if (busiest->group_type == group_misfit_task &&
	    (env->idle == CPU_NEWLY_IDLE ||
	     local->sum_nr_running < local->group_weight)) {
		env->imbalance = max_t(long, env->imbalance,
				       busiest->group_misfit_task_load);
	}

	/*
	 * if *imbalance is less than the average load per runnable task
	 * there is no guarantee that any tasks will be moved so we'll have
	 * a think about bumping its value to force at least one task to be
	 * moved
	 */
	if (env->imbalance < busiest->load_per_task)
		return fix_small_imbalance(env, sds);
	/* so fix_small_imbalance essentially just updates env->imbalance */
}

There are a lot of computations and conditions here, so let's pick out the relevant factors:

busiest->group_misfit_task_load, sds->avg_load, busiest->group_capacity, SCHED_CAPACITY_SCALE, scale_load_down(NICE_0_LOAD), busiest->sum_nr_running

fix_small_imbalance

The code is given first; focus on what it computes rather than on every detail of the calculation. We can come back to the details later.

/**
 * fix_small_imbalance - Calculate the minor imbalance that exists
 *			amongst the groups of a sched_domain, during
 *			load balancing.
 * @env: The load balancing environment.
 * @sds: Statistics of the sched_domain whose imbalance is to be calculated.
 */
static inline
void fix_small_imbalance(struct lb_env *env, struct sd_lb_stats *sds)
{
	unsigned long tmp, capa_now = 0, capa_move = 0;
	unsigned int imbn = 2;
	unsigned long scaled_busy_load_per_task;
	struct sg_lb_stats *local, *busiest;

	local = &sds->local_stat;
	busiest = &sds->busiest_stat;

	if (!local->sum_nr_running)
		local->load_per_task = cpu_avg_load_per_task(env->dst_cpu);
	else if (busiest->load_per_task > local->load_per_task)
		imbn = 1;

	scaled_busy_load_per_task =
		(busiest->load_per_task * SCHED_CAPACITY_SCALE) /
		busiest->group_capacity;

	if (busiest->avg_load + scaled_busy_load_per_task >=
	    local->avg_load + (scaled_busy_load_per_task * imbn)) {
		env->imbalance = busiest->load_per_task;
		return;
	}

	/*
	 * OK, we don't have enough imbalance to justify moving tasks,
	 * however we may be able to increase total CPU capacity used by
	 * moving them.
	 */

	capa_now += busiest->group_capacity *
			min(busiest->load_per_task, busiest->avg_load);
	capa_now += local->group_capacity *
			min(local->load_per_task, local->avg_load);
	capa_now /= SCHED_CAPACITY_SCALE;

	/* Amount of load we'd subtract */
	if (busiest->avg_load > scaled_busy_load_per_task) {
		capa_move += busiest->group_capacity *
			    min(busiest->load_per_task,
				busiest->avg_load - scaled_busy_load_per_task);
	}

	/* Amount of load we'd add */
	if (busiest->avg_load * busiest->group_capacity <
	    busiest->load_per_task * SCHED_CAPACITY_SCALE) {
		tmp = (busiest->avg_load * busiest->group_capacity) /
		      local->group_capacity;
	} else {
		tmp = (busiest->load_per_task * SCHED_CAPACITY_SCALE) /
		      local->group_capacity;
	}
	capa_move += local->group_capacity *
		    min(local->load_per_task, local->avg_load + tmp);
	capa_move /= SCHED_CAPACITY_SCALE;

	/* Move if we gain throughput */
	if (capa_move > capa_now) {
		env->imbalance = busiest->load_per_task;
		return;
	}

	/*
	 * We can't see throughput improvement with the load-based
	 * method, but it is possible depending upon group size and
	 * capacity range that there might still be an underutilized
	 * cpu available in an asymmetric capacity system. Do one last
	 * check just in case.
	 */
	if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
	    busiest->group_type == group_overloaded &&
	    busiest->sum_nr_running > busiest->group_weight &&
	    local->sum_nr_running < local->group_weight &&
	    local->group_capacity < busiest->group_capacity)
		env->imbalance = busiest->load_per_task;
}

 

