
Linux Thermal

Posted: 2019-11-01 19:44:30 | Source: IT技术 | Author: seo实验室小编
 

thermal

The text and figures are reposted from: http://kernel.meizu.com/linux-thermal-framework-intro.html

The code analysis is my own.

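Linux Thermal is the temperature-control subsystem of the Linux kernel. Its job is to manage the heat the chips generate while the system runs, so that both the chip temperature and the device enclosure stay within a safe range.

Main framework of Thermal

Implementing temperature control requires a device that measures temperature, a device that can change it, and a policy that decides how to use the latter.

Temperature-measuring devices: abstracted in the Thermal framework as a Thermal Zone Device;

Temperature-controlling devices: abstracted in the Thermal framework as a Thermal Cooling Device;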


Thermal Zone Device

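As mentioned above, a Thermal Zone Device is the abstraction of a device that measures temperature. How is it abstracted? RTFSC.

The code shows that the main operations of a temperature-providing device are: a bind function, a get-temperature function, and a get-trip-point-temperature function.

Bind function: used by the Thermal core for binding; covered later.

Get-temperature function: reads the device temperature. An SoC usually has internal temperature sensors, and some thermistors can also yield a temperature through an ADC; this function returns those values.

Get-trip-point-temperature function: what is this for? It is a key point of the thermal framework. To control temperature, something has to describe when control should kick in, and that something is a trip point. Each thermal zone device can define many trip points, and the temperature of each one is obtained through this function.

The structures are defined in ./include/linux/thermal.h: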

struct thermal_zone_device {
    int id;
    char type[THERMAL_NAME_LENGTH];
    struct device device;
    struct thermal_attr *trip_temp_attrs;
    struct thermal_attr *trip_type_attrs;
    struct thermal_attr *trip_hyst_attrs;
    void *devdata;
    int trips;
    unsigned long trips_disabled;   /* bitmap for disabled trips */
    /* polling intervals, in milliseconds */
    int passive_delay;
    int polling_delay;
    int temperature;
    int last_temperature;
    int emul_temperature;
    int passive;
    unsigned int forced_passive;
    atomic_t need_update;
    /* device operation callbacks */
    struct thermal_zone_device_ops *ops;
    struct thermal_zone_params *tzp;
    /* cooling policy */
    struct thermal_governor *governor;
    void *governor_data;
    /* important: head of this zone's list of struct thermal_instance */
    struct list_head thermal_instances;
    struct idr idr;
    struct mutex lock;
    struct list_head node;
    /* delayed_work used for the periodic polling loop */
    struct delayed_work poll_queue;
    struct sensor_threshold tz_threshold[2];
    struct sensor_info sensor;
};
struct thermal_zone_device_ops {
    /* bind / unbind callbacks */
    int (*bind) (struct thermal_zone_device *,struct thermal_cooling_device *);
    int (*unbind) (struct thermal_zone_device *,struct thermal_cooling_device *);
    /* read the current temperature */
    int (*get_temp) (struct thermal_zone_device *, unsigned long *);
    int (*get_mode) (struct thermal_zone_device *,enum thermal_device_mode *);
    int (*set_mode) (struct thermal_zone_device *,enum thermal_device_mode);
    int (*get_trip_type) (struct thermal_zone_device *, int,enum thermal_trip_type *);
    int (*activate_trip_type) (struct thermal_zone_device *, int,enum thermal_trip_activation_mode);
    /* read the temperature of a trip point */
    int (*get_trip_temp) (struct thermal_zone_device *, int,unsigned long *);
    int (*set_trip_temp) (struct thermal_zone_device *, int,unsigned long);
    int (*get_trip_hyst) (struct thermal_zone_device *, int,unsigned long *);
    int (*set_trip_hyst) (struct thermal_zone_device *, int,unsigned long);
    int (*get_crit_temp) (struct thermal_zone_device *, unsigned long *);
    int (*set_emul_temp) (struct thermal_zone_device *, unsigned long);
    int (*get_trend) (struct thermal_zone_device *, int,enum thermal_trend *);
    int (*notify) (struct thermal_zone_device *, int,enum thermal_trip_type);
};
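
To make the callback idea concrete, here is a minimal user-space sketch; it is not kernel code. Every name in it (mock_zone, mock_zone_ops, mock_get_temp, mock_get_trip_temp, mock_trips_crossed) is hypothetical, it models only two of the callbacks, and it assumes temperatures are given in millidegrees Celsius.

```c
/* User-space model only; all names are hypothetical, not kernel API. */
struct mock_zone;

struct mock_zone_ops {
    /* read the current temperature (millidegrees Celsius assumed) */
    int (*get_temp)(struct mock_zone *tz, unsigned long *temp);
    /* read the temperature of trip point number `trip` */
    int (*get_trip_temp)(struct mock_zone *tz, int trip, unsigned long *temp);
};

struct mock_zone {
    unsigned long sensor_mc;          /* pretend sensor reading */
    int trips;                        /* number of trip points */
    const unsigned long *trip_mc;     /* trip temperatures */
    const struct mock_zone_ops *ops;  /* callbacks supplied by the "driver" */
};

static int mock_get_temp(struct mock_zone *tz, unsigned long *temp)
{
    *temp = tz->sensor_mc;
    return 0;
}

static int mock_get_trip_temp(struct mock_zone *tz, int trip, unsigned long *temp)
{
    if (trip < 0 || trip >= tz->trips)
        return -1;                    /* no such trip point */
    *temp = tz->trip_mc[trip];
    return 0;
}

static const struct mock_zone_ops mock_ops = {
    .get_temp = mock_get_temp,
    .get_trip_temp = mock_get_trip_temp,
};

/* Count how many trip points the current temperature has reached. */
static int mock_trips_crossed(struct mock_zone *tz)
{
    unsigned long now, trip_temp;
    int i, crossed = 0;

    if (tz->ops->get_temp(tz, &now))
        return -1;
    for (i = 0; i < tz->trips; i++) {
        if (tz->ops->get_trip_temp(tz, i, &trip_temp) == 0 && now >= trip_temp)
            crossed++;
    }
    return crossed;
}
```

The caller only ever goes through tz->ops, which is the point of the abstraction: the core does not need to know whether the reading comes from an on-die sensor or from a thermistor behind an ADC.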

Thermal Cooling Devices

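Thermal Cooling Devices are the abstraction of devices that can lower the temperature. A fan is the obvious example, but how should one understand cooling devices such as the CPU or GPU?

CPU and GPU cooling devices lower the temperature by generating less heat, whereas fans and heat sinks speed up heat dissipation.

The abstraction treats every cooling device as having a number of individually selectable states; a fan, for example, has different speed states.

A CPU/GPU cooling device has states that correspond to different maximum operating frequencies, so when the temperature rises, changing the state brings it back down: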

struct thermal_cooling_device {
    int id;
    char type[THERMAL_NAME_LENGTH];
    struct device device;
    struct device_node *np;
    void *devdata;
    /* operation callbacks */
    const struct thermal_cooling_device_ops *ops;
    bool updated; /* true if the cooling device does not need update */
    struct mutex lock; /* protect thermal_instances list */
    /* as above: head of this device's list of struct thermal_instance */
    struct list_head thermal_instances;
    struct list_head node;
};
struct thermal_cooling_device_ops {
    int (*get_max_state) (struct thermal_cooling_device *, unsigned long *);
    int (*get_cur_state) (struct thermal_cooling_device *, unsigned long *);
    /* set the cooling state (level) */
    int (*set_cur_state) (struct thermal_cooling_device *, unsigned long);
    int (*get_requested_power)(struct thermal_cooling_device *,struct thermal_zone_device *, u32 *);
    int (*state2power)(struct thermal_cooling_device *,struct thermal_zone_device *, unsigned long, u32 *);
    int (*power2state)(struct thermal_cooling_device *,struct thermal_zone_device *, u32, unsigned long *);
};
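
The "numbered states" idea can be sketched in user space as follows. This is an illustration under assumptions, not kernel code: mock_cdev and the mock_* functions are hypothetical names, and the frequency table handed to mock_state_to_khz is made up for the example.

```c
/* User-space model only; all names are hypothetical, not kernel API. */
struct mock_cdev {
    unsigned long max_state;  /* highest cooling state, e.g. top fan speed */
    unsigned long cur_state;  /* 0 means no cooling at all */
};

static int mock_get_max_state(struct mock_cdev *cdev, unsigned long *state)
{
    *state = cdev->max_state;
    return 0;
}

static int mock_get_cur_state(struct mock_cdev *cdev, unsigned long *state)
{
    *state = cdev->cur_state;
    return 0;
}

static int mock_set_cur_state(struct mock_cdev *cdev, unsigned long state)
{
    if (state > cdev->max_state)
        return -1;            /* reject states the device does not have */
    cdev->cur_state = state;
    return 0;
}

/*
 * For a CPU-like device a cooling state can be read as an index into a
 * table of frequency caps sorted from highest to lowest: state 0 leaves
 * the CPU unrestricted, higher states cap it lower. Out-of-range states
 * are clamped to the last (lowest) entry.
 */
static unsigned long mock_state_to_khz(const unsigned long *freqs_desc,
                                       unsigned long nfreqs,
                                       unsigned long state)
{
    if (state >= nfreqs)
        state = nfreqs - 1;
    return freqs_desc[state];
}
```

A fan and a CPU look identical through this interface; only the meaning of a state differs (more airflow versus a lower frequency cap).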

Thermal Governor

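A Thermal Governor is the abstraction of a cooling policy: a method for choosing the state of the thermal cooling devices based on temperature. A simple example: if the temperature is rising quickly, run the fan at speed 3; if it is rising slowly, run it at speed 1. That is a governor.

The interface is simple: every policy is implemented through the throttle callback. The kernel already ships several governors, including step_wise, user_space, power_allocator, and bang_bang; their algorithms are not covered here.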

/**
 * struct thermal_governor - structure that holds thermal governor information
 * @name:   name of the governor
 * @bind_to_tz: callback called when binding to a thermal zone.  If it
 *      returns 0, the governor is bound to the thermal zone,
 *      otherwise it fails.
 * @unbind_from_tz: callback called when a governor is unbound from a
 *          thermal zone.
 * @throttle:   callback called for every trip point even if temperature is
 *      below the trip point temperature
 * @governor_list:  node in thermal_governor_list (in thermal_core.c)
 */
struct thermal_governor {
    char name[THERMAL_NAME_LENGTH];
    int (*bind_to_tz)(struct thermal_zone_device *tz);
    void (*unbind_from_tz)(struct thermal_zone_device *tz);
    /* policy function */
    int (*throttle)(struct thermal_zone_device *tz, int trip);
    struct list_head    governor_list;
};
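
As a rough illustration of what a throttle decision can look like, the sketch below moves a cooling state one step at a time. It is only loosely inspired by the idea behind step_wise and is not the kernel's algorithm; mock_trend and mock_throttle are hypothetical names, and the rule itself is an assumption made for the example.

```c
/* User-space model only; names and policy are hypothetical. */
enum mock_trend {
    MOCK_TREND_STABLE,
    MOCK_TREND_RAISING,
    MOCK_TREND_DROPPING
};

/*
 * Pick the next cooling state:
 *  - at or above the trip temperature and still rising: one step up
 *  - below the trip temperature and dropping:           one step down
 *  - otherwise:                                         keep the state
 * The result is always clamped to [lower, upper].
 */
static unsigned long mock_throttle(long temp, long trip_temp,
                                   enum mock_trend trend,
                                   unsigned long cur,
                                   unsigned long lower,
                                   unsigned long upper)
{
    unsigned long next = cur;

    if (temp >= trip_temp) {
        if (trend == MOCK_TREND_RAISING && cur < upper)
            next = cur + 1;
    } else {
        if (trend == MOCK_TREND_DROPPING && cur > lower)
            next = cur - 1;
    }

    if (next < lower)
        next = lower;
    if (next > upper)
        next = upper;
    return next;
}
```

Stepping one state per poll keeps the policy simple and avoids large jumps, at the cost of reacting more slowly to a sudden temperature spike.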

Thermal Core

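With devices that measure temperature, devices that control it, and a control policy in place, the Thermal Core is responsible for tying them together. RTFSC.

1. Registration. The Thermal Core exposes registration interfaces through which thermal zone devices, thermal cooling devices, and thermal governors register themselves.

This interface adds a new thermal zone device (sensor) under /sys/class/thermal, named thermal_zone[0-*], and at the same time tries to bind the registered thermal cooling devices to it. It returns a pointer to the newly created thermal_zone_device.

struct thermal_zone_device *thermal_zone_device_register(const char *type,
        int trips, int mask, void *devdata,
        struct thermal_zone_device_ops *ops,
        struct thermal_zone_params *tzp,
        int passive_delay, int polling_delay)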
thermal_zone_device_register() - register a new thermal zone device
@type:  the thermal zone device type
@trips: the number of trip points the thermal zone supports
@mask:  a bit string indicating the writability of trip points
@devdata:   private device data
@ops:   standard thermal zone device callbacks
@tzp:   thermal zone platform parameters
@passive_delay: number of milliseconds to wait between polls when performing passive cooling
@polling_delay: number of milliseconds to wait between polls when checking whether trip points have been crossed (0 for interrupt driven systems)

This interface adds a new thermal cooling device (fan/processor/…) under /sys/class/thermal/ as cooling_device[0-*], and tries to bind it to the thermal zone devices that are already registered. It returns a pointer to the thermal_cooling_device structure.

struct thermal_cooling_device *thermal_cooling_device_register(char *type, void *devdata, const struct thermal_cooling_device_ops *ops)
thermal_cooling_device_register() - register a new thermal cooling device
@type:  the thermal cooling device type.
@devdata:   device private data.
@ops:       standard thermal cooling devices callbacks.

This interface registers a thermal governor:

int thermal_register_governor(struct thermal_governor *governor)

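2. Binding. While a thermal zone or cooling device is being registered, the thermal core calls the bind function. The essence of binding is attaching one cooling device to one trip point of a thermal zone.

The interface below links a thermal cooling device to a specific trip point of a thermal zone device. It returns 0 on success.

/* first, the structure involved */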
/*
 * This structure is used to describe the behavior of
 * a certain cooling device on a certain trip point
 * in a certain thermal zone
 */
struct thermal_instance {
    int id;
    char name[THERMAL_NAME_LENGTH];
    struct thermal_zone_device *tz;
    struct thermal_cooling_device *cdev;
    int trip;
    bool initialized;
    unsigned long upper;    /* Highest cooling state for this trip point */
    unsigned long lower;    /* Lowest cooling state for this trip point */
    unsigned long target;   /* expected cooling state */
    char attr_name[THERMAL_NAME_LENGTH];
    struct device_attribute attr;
    char weight_attr_name[THERMAL_NAME_LENGTH];
    struct device_attribute weight_attr;
    struct list_head tz_node; /* important: node in tz->thermal_instances */
    struct list_head cdev_node; /* important: node in cdev->thermal_instances */
    unsigned int weight; /* The weight of the cooling device */
};
thermal_zone_bind_cooling_device() - bind a cooling device to a thermal zone
@tz:    pointer to struct thermal_zone_device
@trip:  indicates which trip point the cooling device is associated with in this thermal zone.
@cdev:  pointer to struct thermal_cooling_device
@upper: the maximum cooling state for this trip point. THERMAL_NO_LIMIT means no upper limit, and the cooling device can be in max_state.
@lower: the minimum cooling state that can be used for this trip point. THERMAL_NO_LIMIT means no lower limit, and the cooling device can be in cooling state 0.
@weight: the weight of the cooling device to be bound to the thermal zone. Use THERMAL_WEIGHT_DEFAULT for the default value.
int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
                     int trip,
                     struct thermal_cooling_device *cdev,
                     unsigned long upper, unsigned long lower,
                     unsigned int weight)
{
    struct thermal_instance *dev; /* describes how the zone and the cooling device relate at one trip point */
    struct thermal_instance *pos;
    struct thermal_zone_device *pos1;
    struct thermal_cooling_device *pos2;
    unsigned long max_state;
    int result;

    /* look tz and cdev up in the global lists to check that both are registered */
    list_for_each_entry(pos1, &thermal_tz_list, node) { if (pos1 == tz) break; }
    list_for_each_entry(pos2, &thermal_cdev_list, node) { if (pos2 == cdev) break; }
    if (tz != pos1 || cdev != pos2)
        return -EINVAL;

    /* ask the cooling device for its highest cooling state */
    result = cdev->ops->get_max_state(cdev, &max_state);
    if (result)
        return result;

    /* lower default 0, upper default max_state */
    lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
    upper = upper == THERMAL_NO_LIMIT ? max_state : upper;

    dev = kzalloc(sizeof(struct thermal_instance), GFP_KERNEL); /* allocate the instance */
    if (!dev)
        return -ENOMEM;
    dev->tz = tz; /* the zone device */
    dev->cdev = cdev; /* the cooling device */
    dev->trip = trip; /* the trip point that triggers this binding */
    dev->upper = upper; /* upper state limit */
    dev->lower = lower; /* lower state limit */
    dev->target = THERMAL_NO_TARGET; /* no target cooling state yet; the governor fills it in later */
    dev->weight = weight; /* weight of this cooling device */
    /* allocate an id dynamically (through the idr) and store it in dev->id */
    result = get_idr(&tz->idr, &tz->lock, &dev->id);
    if (result)
        goto free_mem;

    sprintf(dev->name, "cdev%d", dev->id); /* build the instance name from the id */
    /* a kobject corresponds to one directory in sysfs */
    /* create a symlink named dev->name in the tz->device.kobj directory that points at the cdev->device.kobj directory */
    result = sysfs_create_link(&tz->device.kobj, &cdev->device.kobj, dev->name);
    if (result)
        goto release_idr;

    sprintf(dev->attr_name, "cdev%d_trip_point", dev->id); /* build attr_name from the id */
    sysfs_attr_init(&dev->attr.attr); /* initialize the dynamically allocated sysfs attribute */
    /* fill in the attribute */
    dev->attr.attr.name = dev->attr_name;
    dev->attr.attr.mode = 0444;
    dev->attr.show = thermal_cooling_device_trip_point_show; /* show callback, invoked when the file is read (for example with cat) */
    /* device_create_file() creates the attribute file in the device's sysfs directory */
    result = device_create_file(&tz->device, &dev->attr);
    if (result)
        goto remove_symbol_link;

    /* same pattern for the weight attribute; weight is the relative influence of this cooling device within the zone */
    sprintf(dev->weight_attr_name, "cdev%d_weight", dev->id);
    sysfs_attr_init(&dev->weight_attr.attr);
    dev->weight_attr.attr.name = dev->weight_attr_name;
    dev->weight_attr.attr.mode = S_IWUSR | S_IRUGO;
    dev->weight_attr.show = thermal_cooling_device_weight_show;
    dev->weight_attr.store = thermal_cooling_device_weight_store;
    result = device_create_file(&tz->device, &dev->weight_attr);
    if (result)
        goto remove_trip_file;

    mutex_lock(&tz->lock);  /* lock the zone */
    mutex_lock(&cdev->lock);  /* lock the cooling device */
    /* walk the zone's thermal_instances list looking for an identical instance */
    list_for_each_entry(pos, &tz->thermal_instances, tz_node)
        if (pos->tz == tz && pos->trip == trip && pos->cdev == cdev) {
            result = -EEXIST; /* already bound */
            break;
        }
    if (!result) {  /* not found: add the instance to both lists */
        list_add_tail(&dev->tz_node, &tz->thermal_instances); /* onto the zone's instance list */
        list_add_tail(&dev->cdev_node, &cdev->thermal_instances); /* onto the cooling device's instance list */
        atomic_set(&tz->need_update, 1); /* flag the zone as needing an update */
    }
    mutex_unlock(&cdev->lock);  /* unlock the cooling device */
    mutex_unlock(&tz->lock);    /* unlock the zone */

    if (!result)
        return 0;

    /* error path: undo the steps above in reverse order */
    device_remove_file(&tz->device, &dev->weight_attr);
remove_trip_file:
    device_remove_file(&tz->device, &dev->attr);
remove_symbol_link:
    sysfs_remove_link(&tz->device.kobj, dev->name);
release_idr:
    release_idr(&tz->idr, &tz->lock, dev->id);
free_mem:
    kfree(dev);

    return result;
}
EXPORT_SYMBOL_GPL(thermal_zone_bind_cooling_device); /* export the symbol so other GPL-compatible kernel code can call it */
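
The bookkeeping part of binding (resolving the limits, then refusing a duplicate zone/trip/cooling-device triple) can be modelled in user space as below. This is a simplified sketch, not the kernel implementation: a fixed array stands in for the linked lists, MOCK_NO_LIMIT stands in for THERMAL_NO_LIMIT, and every mock_* name and return code is hypothetical.

```c
#include <limits.h>

/* User-space model only; all names and return codes are hypothetical. */
#define MOCK_NO_LIMIT      ULONG_MAX  /* stand-in for THERMAL_NO_LIMIT */
#define MOCK_MAX_INSTANCES 8

struct mock_instance {
    int tz_id;
    int cdev_id;
    int trip;
    unsigned long lower;
    unsigned long upper;
};

struct mock_bindings {
    struct mock_instance inst[MOCK_MAX_INSTANCES];
    int count;
};

/*
 * Returns 0 on success, -1 for inconsistent limits, -2 when the same
 * (zone, trip, cooling device) triple is already bound, -3 when full.
 */
static int mock_bind(struct mock_bindings *b, int tz_id, int trip, int cdev_id,
                     unsigned long upper, unsigned long lower,
                     unsigned long max_state)
{
    int i;

    /* lower defaults to 0, upper defaults to max_state */
    lower = (lower == MOCK_NO_LIMIT) ? 0 : lower;
    upper = (upper == MOCK_NO_LIMIT) ? max_state : upper;
    if (lower > upper || upper > max_state)
        return -1;

    /* refuse to bind the same triple twice */
    for (i = 0; i < b->count; i++) {
        if (b->inst[i].tz_id == tz_id && b->inst[i].trip == trip &&
            b->inst[i].cdev_id == cdev_id)
            return -2;
    }

    if (b->count == MOCK_MAX_INSTANCES)
        return -3;

    b->inst[b->count++] = (struct mock_instance){ tz_id, cdev_id, trip, lower, upper };
    return 0;
}
```

The same cooling device may still be bound to a different trip point of the same zone; only an exact repeat of all three keys is rejected.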

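3. Polling. The Thermal core starts a delayed_work loop that keeps the whole thermal control flow running. When the temperature rises past a trip point, the corresponding cooling device is activated to bring it down.

It starts in thermal_zone_device_register(), which does the following:

a. bind_tz(tz) -> __bind() -> thermal_zone_bind_cooling_device() binds the zone to its cooling devices.

b. INIT_DELAYED_WORK(&(tz->poll_queue), thermal_zone_device_check); initializes the work item poll_queue with the work function thermal_zone_device_check.

c. if (!tz->ops->get_temp) thermal_zone_device_set_polling(tz, 0); If the zone has no get_temp callback, thermal_zone_device_set_polling() is called with a delay of 0, which calls cancel_delayed_work(&tz->poll_queue) to cancel the delayed work.

d. thermal_zone_device_reset(tz); resets the zone device: it sets tz->temperature = THERMAL_TEMP_INVALID and tz->passive = 0, and sets pos->initialized = false for every instance.

e. Then comes the key step:

atomic_cmpxchg() is an atomic compare-and-exchange. It checks whether need_update equals 1; if so it stores 0 in need_update, otherwise it leaves the value alone. Either way it returns the value need_update held before the operation.

If the earlier bind succeeded, need_update was set to 1 by the atomic_set() call in the bind function,

so thermal_zone_device_update(tz) is then called: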

if (atomic_cmpxchg(&tz->need_update, 1, 0))
        thermal_zone_device_update(tz);
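
The "test the flag and clear it in one step" behaviour can be tried out in user space with C11 atomics. The wrapper below only mimics the return convention described above (always return the previous value); mock_cmpxchg is a hypothetical name and is not how the kernel implements atomic_cmpxchg().

```c
#include <stdatomic.h>

/*
 * User-space model only: compare *v with old_val and, if they match,
 * store new_val. Always returns the value *v held before the call.
 */
static int mock_cmpxchg(atomic_int *v, int old_val, int new_val)
{
    int expected = old_val;

    /*
     * On success `expected` still equals old_val (the previous value).
     * On failure the call writes the current value into `expected`.
     */
    atomic_compare_exchange_strong(v, &expected, new_val);
    return expected;
}
```

Because the check and the clear are a single atomic operation, two callers racing on the same flag cannot both see 1: only one of them gets to run the update.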
/* inside thermal_zone_device_update(tz): */
First update_temperature() runs:
    thermal_zone_get_temp(tz, &temp) -> tz->ops->get_temp(tz, temp) reads the temperature,
    and then the values are stored:
    tz->last_temperature = tz->temperature;
    tz->temperature = temp;
Next each trip point is handled, which is where the governor gets called:
for (count = 0; count < tz->trips; count++) handle_thermal_trip(tz, count);
    handle_thermal_trip() first calls tz->ops->get_trip_type(tz, trip, &type); to get the type of the trip point,
    then dispatches on that type to handle_critical_trips(tz, trip, type); or handle_non_critical_trips(tz, trip, type); (the latter is the path that calls the governor's throttle callback).
    After a trip point has been handled, monitor_thermal_zone(tz) is called to restart the monitor.
    Before looking at monitor_thermal_zone(), here are the zone device members it uses:
    passive: 1 if you've crossed a passive trip point, 0 otherwise. It becomes 1 once a passive trip temperature has been crossed; the earlier reset set it to 0.
    passive_delay: number of milliseconds to wait between polls when performing passive cooling. This is the poll interval while cooling.
    polling_delay: number of milliseconds to wait between polls when checking whether trip points have been crossed (0 for interrupt driven systems). This is the normal checking interval.
    Based on these three members, monitor_thermal_zone() calls thermal_zone_device_set_polling(), shown here:
static void thermal_zone_device_set_polling(struct thermal_zone_device *tz,
                        int delay)
{
    if (delay > 1000)
        /* queue tz->poll_queue on system_freezable_wq to run after delay ms; for delays over one second round_jiffies() rounds the timeout to a whole second, trading precision for fewer wakeups */
        mod_delayed_work(system_freezable_wq, &tz->poll_queue, round_jiffies(msecs_to_jiffies(delay)));
    else if (delay) /* queue the delayed work with the exact delay */
        mod_delayed_work(system_freezable_wq, &tz->poll_queue,msecs_to_jiffies(delay));
    else /* delay == 0: cancel the work */
        cancel_delayed_work(&tz->poll_queue);
}
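
How passive, passive_delay, and polling_delay combine into a single delay can be sketched as follows. This is a user-space model of the selection logic as I read it, not the kernel function itself; mock_poll_cfg, mock_pick_delay, and mock_poll_kind are hypothetical names.

```c
/* User-space model only; all names are hypothetical, not kernel API. */
struct mock_poll_cfg {
    int passive;        /* non-zero while passive cooling is active */
    int passive_delay;  /* ms between polls while passive cooling */
    int polling_delay;  /* ms between normal polls, 0 = interrupt driven */
};

enum mock_poll_kind {
    MOCK_POLL_CANCEL,   /* delay == 0: stop polling */
    MOCK_POLL_EXACT,    /* short delay: schedule with the exact timeout */
    MOCK_POLL_ROUNDED   /* delay > 1000 ms: timeout may be rounded */
};

/* Choose the delay in milliseconds that would be passed on; 0 means cancel. */
static int mock_pick_delay(const struct mock_poll_cfg *c)
{
    if (c->passive)
        return c->passive_delay;
    if (c->polling_delay)
        return c->polling_delay;
    return 0;
}

/* Mirror the three branches of the set_polling function shown above. */
static enum mock_poll_kind mock_poll_kind_for(int delay)
{
    if (delay > 1000)
        return MOCK_POLL_ROUNDED;
    if (delay)
        return MOCK_POLL_EXACT;
    return MOCK_POLL_CANCEL;
}
```

For example, a zone with passive_delay = 250 and polling_delay = 2000 would be polled every 250 ms while passive cooling is active and roughly every 2 s otherwise.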

Next, what the delayed work actually does:

static void thermal_zone_device_check(struct work_struct *work)
{
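    /* recover the zone structure from the work item embedded in it */
    struct thermal_zone_device *tz = container_of(work, struct thermal_zone_device, poll_queue.work);
    thermal_zone_device_update(tz); /* the same function as above again: read and store the temperature, run the governor, restart the monitor, and set polling so the work fires again after the delay (passive_delay while passive cooling is active, polling_delay otherwise), looping indefinitely */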
}
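
The container_of() step is the least obvious part of this function, so here is a small user-space sketch of the same pointer arithmetic. The macro and the mock_* structures are simplified, hypothetical stand-ins; the kernel's own container_of() adds type checking that is omitted here.

```c
#include <stddef.h>

/*
 * Simplified stand-in for the kernel's container_of(): given a pointer to
 * a member, step back by the member's offset to reach the enclosing struct.
 */
#define mock_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct mock_work {
    int pending;
};

struct mock_delayed_work {
    struct mock_work work;   /* the part the workqueue hands back */
    long delay;
};

struct mock_tz {
    int id;
    struct mock_delayed_work poll_queue;
};

/* What a work handler does first: get from the work item back to its zone. */
static struct mock_tz *mock_tz_from_work(struct mock_work *work)
{
    return mock_container_of(work, struct mock_tz, poll_queue.work);
}
```

This is why poll_queue is embedded in the zone structure rather than pointed to: the work handler receives only the work item, and the embedding lets it recover the zone without any lookup table.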


Last published: 2018-05-31 14:43:07
