Diffstat (limited to 'Documentation/cpu-freq')
| -rw-r--r-- | Documentation/cpu-freq/boost.txt | 26 |
| -rw-r--r-- | Documentation/cpu-freq/core.txt | 31 |
| -rw-r--r-- | Documentation/cpu-freq/cpu-drivers.txt | 106 |
| -rw-r--r-- | Documentation/cpu-freq/governors.txt | 35 |
| -rw-r--r-- | Documentation/cpu-freq/index.txt | 4 |
| -rw-r--r-- | Documentation/cpu-freq/intel-pstate.txt | 43 |
| -rw-r--r-- | Documentation/cpu-freq/user-guide.txt | 8 |
7 files changed, 203 insertions, 50 deletions
diff --git a/Documentation/cpu-freq/boost.txt b/Documentation/cpu-freq/boost.txt
index 9b4edfcf486..dd62e1334f0 100644
--- a/Documentation/cpu-freq/boost.txt
+++ b/Documentation/cpu-freq/boost.txt
@@ -17,8 +17,8 @@ Introduction
 Some CPUs support a functionality to raise the operating frequency of
 some cores in a multi-core package if certain conditions apply, mostly
 if the whole chip is not fully utilized and below it's intended thermal
-budget. This is done without operating system control by a combination
-of hardware and firmware.
+budget. The decision about boost disable/enable is made either at hardware
+(e.g. x86) or software (e.g ARM).
 On Intel CPUs this is called "Turbo Boost", AMD calls it "Turbo-Core",
 in technical documentation "Core performance boost". In Linux we use
 the term "boost" for convenience.
@@ -48,24 +48,24 @@ be desirable:
 User controlled switch
 ----------------------
 
-To allow the user to toggle the boosting functionality, the acpi-cpufreq
-driver exports a sysfs knob to disable it. There is a file:
+To allow the user to toggle the boosting functionality, the cpufreq core
+driver exports a sysfs knob to enable or disable it. There is a file:
         /sys/devices/system/cpu/cpufreq/boost
 which can either read "0" (boosting disabled) or "1" (boosting enabled).
-Reading the file is always supported, even if the processor does not
-support boosting. In this case the file will be read-only and always
-reads as "0". Explicitly changing the permissions and writing to that
-file anyway will return EINVAL.
+The file is exported only when cpufreq driver supports boosting.
+Explicitly changing the permissions and writing to that file anyway will
+return EINVAL.
 
 On supported CPUs one can write either a "0" or a "1" into this file.
 This will either disable the boost functionality on all cores in the
-whole system (0) or will allow the hardware to boost at will (1).
+whole system (0) or will allow the software or hardware to boost at will
+(1).
 
 Writing a "1" does not explicitly boost the system, but just allows the
-CPU (and the firmware) to boost at their discretion. Some implementations
-take external factors like the chip's temperature into account, so
-boosting once does not necessarily mean that it will occur every time
-even using the exact same software setup.
+CPU to boost at their discretion. Some implementations take external
+factors like the chip's temperature into account, so boosting once does
+not necessarily mean that it will occur every time even using the exact
+same software setup.
 
 
 AMD legacy cpb switch
diff --git a/Documentation/cpu-freq/core.txt b/Documentation/cpu-freq/core.txt
index ce0666e5103..70933eadc30 100644
--- a/Documentation/cpu-freq/core.txt
+++ b/Documentation/cpu-freq/core.txt
@@ -20,6 +20,7 @@ Contents:
 ---------
 1.  CPUFreq core and interfaces
 2.  CPUFreq notifiers
+3.  CPUFreq Table Generation with Operating Performance Point (OPP)
 
 1. General Information
 =======================
@@ -93,6 +94,30 @@ cpu - number of the affected CPU
 old     - old frequency
 new     - new frequency
 
-If the cpufreq core detects the frequency has changed while the system
-was suspended, these notifiers are called with CPUFREQ_RESUMECHANGE as
-second argument.
+3. CPUFreq Table Generation with Operating Performance Point (OPP)
+==================================================================
+For details about OPP, see Documentation/power/opp.txt
+
+dev_pm_opp_init_cpufreq_table - cpufreq framework typically is initialized with
+        cpufreq_frequency_table_cpuinfo which is provided with the list of
+        frequencies that are available for operation. This function provides
+        a ready to use conversion routine to translate the OPP layer's internal
+        information about the available frequencies into a format readily
+        providable to cpufreq.
+
+        WARNING: Do not use this function in interrupt context.
+
+        Example:
+         soc_pm_init()
+         {
+                /* Do things */
+                r = dev_pm_opp_init_cpufreq_table(dev, &freq_table);
+                if (!r)
+                        cpufreq_frequency_table_cpuinfo(policy, freq_table);
+                /* Do other things */
+         }
+
+        NOTE: This function is available only if CONFIG_CPU_FREQ is enabled in
+        addition to CONFIG_PM_OPP.
+
+dev_pm_opp_free_cpufreq_table - Free up the table allocated by dev_pm_opp_init_cpufreq_table
diff --git a/Documentation/cpu-freq/cpu-drivers.txt b/Documentation/cpu-freq/cpu-drivers.txt
index c436096351f..14f4e6336d8 100644
--- a/Documentation/cpu-freq/cpu-drivers.txt
+++ b/Documentation/cpu-freq/cpu-drivers.txt
@@ -23,9 +23,10 @@ Contents:
 1.1  Initialization
 1.2  Per-CPU Initialization
 1.3  verify
-1.4  target or setpolicy?
-1.5  target
+1.4  target/target_index or setpolicy?
+1.5  target/target_index
 1.6  setpolicy
+1.7  get_intermediate and target_intermediate
 
 2.   Frequency Table Helpers
 
@@ -50,30 +51,39 @@ What shall this struct cpufreq_driver contain?
 
 cpufreq_driver.name -           The name of this driver.
 
-cpufreq_driver.owner -          THIS_MODULE;
-
 cpufreq_driver.init -           A pointer to the per-CPU initialization
                                 function.
 
 cpufreq_driver.verify -         A pointer to a "verification" function.
 
 cpufreq_driver.setpolicy _or_
-cpufreq_driver.target -         See below on the differences.
+cpufreq_driver.target/
+target_index            -       See below on the differences.
 
 And optionally
 
-cpufreq_driver.exit -           A pointer to a per-CPU cleanup function.
+cpufreq_driver.exit -           A pointer to a per-CPU cleanup
+                                function called during CPU_POST_DEAD
+                                phase of cpu hotplug process.
+
+cpufreq_driver.stop_cpu -       A pointer to a per-CPU stop function
+                                called during CPU_DOWN_PREPARE phase of
+                                cpu hotplug process.
 
 cpufreq_driver.resume -         A pointer to a per-CPU resume function
                                 which is called with interrupts disabled
                                 and _before_ the pre-suspend frequency
                                 and/or policy is restored by a call to
-                                ->target or ->setpolicy.
+                                ->target/target_index or ->setpolicy.
 
 cpufreq_driver.attr -           A pointer to a NULL-terminated list of
                                 "struct freq_attr" which allow to
                                 export values to sysfs.
 
+cpufreq_driver.get_intermediate
+and target_intermediate         Used to switch to stable frequency while
+                                changing CPU frequency.
+
 
 1.2 Per-CPU Initialization
 --------------------------
@@ -105,11 +115,18 @@ policy->governor                must contain the "default policy" for
                                 this CPU. A few moments later,
                                 cpufreq_driver.verify and either
                                 cpufreq_driver.setpolicy or
-                                cpufreq_driver.target is called with
-                                these values.
+                                cpufreq_driver.target/target_index is called
+                                with these values.
 
-For setting some of these values, the frequency table helpers might be
-helpful. See the section 2 for more information on them.
+For setting some of these values (cpuinfo.min[max]_freq, policy->min[max]), the
+frequency table helpers might be helpful. See the section 2 for more information
+on them.
+
+SMP systems normally have same clock source for a group of cpus. For these the
+.init() would be called only once for the first online cpu. Here the .init()
+routine must initialize policy->cpus with mask of all possible cpus (Online +
+Offline) that share the clock. Then the core would copy this mask onto
+policy->related_cpus and will reset policy->cpus to carry only online cpus.
 
 
 1.3 verify
@@ -128,20 +145,31 @@ range) is within policy->min and policy->max. If necessary, increase
 policy->max first, and only if this is no solution, decrease policy->min.
 
 
-1.4 target or setpolicy?
+1.4 target/target_index or setpolicy?
 ----------------------------
 
 Most cpufreq drivers or even most cpu frequency scaling algorithms
 only allow the CPU to be set to one frequency. For these, you use the
-->target call.
+->target/target_index call.
 
 Some cpufreq-capable processors switch the frequency between certain
 limits on their own. These shall use the ->setpolicy call
 
 
-1.4. target
+1.5. target/target_index
 -------------
 
+The target_index call has two arguments: struct cpufreq_policy *policy,
+and unsigned int index (into the exposed frequency table).
+
+The CPUfreq driver must set the new frequency when called here. The
+actual frequency must be determined by freq_table[index].frequency.
+
+It should always restore to earlier frequency (i.e. policy->restore_freq) in
+case of errors, even if we switched to intermediate frequency earlier.
+
+Deprecated:
+----------
 The target call has three arguments: struct cpufreq_policy *policy,
 unsigned int target_frequency, unsigned int relation.
 
@@ -159,7 +187,7 @@ Here again the frequency table helper might assist you - see section 2
 for details.
 
 
-1.5 setpolicy
+1.6 setpolicy
 ---------------
 
 The setpolicy call only takes a struct cpufreq_policy *policy as
@@ -170,6 +198,23 @@ setting when policy->policy is CPUFREQ_POLICY_PERFORMANCE, and a
 powersaving-oriented setting when CPUFREQ_POLICY_POWERSAVE. Also check
 the reference implementation in drivers/cpufreq/longrun.c
 
+1.7 get_intermediate and target_intermediate
+--------------------------------------------
+
+Only for drivers with target_index() and CPUFREQ_ASYNC_NOTIFICATION unset.
+
+get_intermediate should return a stable intermediate frequency platform wants to
+switch to, and target_intermediate() should set CPU to to that frequency, before
+jumping to the frequency corresponding to 'index'. Core will take care of
+sending notifications and driver doesn't have to handle them in
+target_intermediate() or target_index().
+
+Drivers can return '0' from get_intermediate() in case they don't wish to switch
+to intermediate frequency for some target frequency. In that case core will
+directly call ->target_index().
+
+NOTE: ->target_index() should restore to policy->restore_freq in case of
+failures as core would send notifications for that.
 
 
 2. Frequency Table Helpers
@@ -178,10 +223,10 @@ the reference implementation in drivers/cpufreq/longrun.c
 As most cpufreq processors only allow for being set to a few specific
 frequencies, a "frequency table" with some functions might assist in
 some work of the processor driver. Such a "frequency table" consists
-of an array of struct cpufreq_freq_table entries, with any value in
-"index" you want to use, and the corresponding frequency in
+of an array of struct cpufreq_frequency_table entries, with any value in
+"driver_data" you want to use, and the corresponding frequency in
 "frequency". At the end of the table, you need to add a
-cpufreq_freq_table entry with frequency set to CPUFREQ_TABLE_END. And
+cpufreq_frequency_table entry with frequency set to CPUFREQ_TABLE_END. And
 if you want to skip one entry in the table, set the frequency to
 CPUFREQ_ENTRY_INVALID. The entries don't need to be in ascending
 order.
@@ -207,10 +252,23 @@ int cpufreq_frequency_table_target(struct cpufreq_policy *policy,
 is the corresponding frequency table helper for the ->target
 stage. Just pass the values to this function, and the unsigned int
 index returns the number of the frequency table entry which contains
-the frequency the CPU shall be set to. PLEASE NOTE: This is not the
-"index" which is in this cpufreq_table_entry.index, but instead
-cpufreq_table[index]. So, the new frequency is
-cpufreq_table[index].frequency, and the value you stored into the
-frequency table "index" field is
-cpufreq_table[index].index.
+the frequency the CPU shall be set to.
+
+The following macros can be used as iterators over cpufreq_frequency_table:
+
+cpufreq_for_each_entry(pos, table) - iterates over all entries of frequency
+table.
+
+cpufreq-for_each_valid_entry(pos, table) - iterates over all entries,
+excluding CPUFREQ_ENTRY_INVALID frequencies.
+Use arguments "pos" - a cpufreq_frequency_table * as a loop cursor and
+"table" - the cpufreq_frequency_table * you want to iterate over.
+
+For example:
+
+        struct cpufreq_frequency_table *pos, *driver_freq_table;
+        cpufreq_for_each_entry(pos, driver_freq_table) {
+                /* Do something with pos */
+                pos->frequency = ...
+        }
 
diff --git a/Documentation/cpu-freq/governors.txt b/Documentation/cpu-freq/governors.txt
index c7a2eb8450c..77ec21574fb 100644
--- a/Documentation/cpu-freq/governors.txt
+++ b/Documentation/cpu-freq/governors.txt
@@ -40,7 +40,7 @@ Most cpufreq drivers (in fact, all except one, longrun) or even most
 cpu frequency scaling algorithms only offer the CPU to be set to one
 frequency. In order to offer dynamic frequency scaling, the cpufreq
 core must be able to tell these drivers of a "target frequency". So
-these specific drivers will be transformed to offer a "->target"
+these specific drivers will be transformed to offer a "->target/target_index"
 call instead of the existing "->setpolicy" call. For "longrun", all
 stays the same, though.
 
@@ -71,7 +71,7 @@ CPU can be set to switch independently   |         CPU can only be set
                     /                          the limits of policy->{min,max}
                    /                                \
                   /                                  \
-        Using the ->setpolicy call,              Using the ->target call,
+        Using the ->setpolicy call,              Using the ->target/target_index call,
             the limits and the                    the frequency closest
              "policy" is set.                     to target_freq is set.
                                                   It is assured that it
@@ -131,8 +131,8 @@ sampling_rate_min:
 The sampling rate is limited by the HW transition latency:
 transition_latency * 100
 Or by kernel restrictions:
-If CONFIG_NO_HZ is set, the limit is 10ms fixed.
-If CONFIG_NO_HZ is not set or nohz=off boot parameter is used, the
+If CONFIG_NO_HZ_COMMON is set, the limit is 10ms fixed.
+If CONFIG_NO_HZ_COMMON is not set or nohz=off boot parameter is used, the
 limits depend on the CONFIG_HZ option:
 HZ=1000: min=20000us  (20ms)
 HZ=250:  min=80000us  (80ms)
@@ -167,6 +167,27 @@ of load evaluation and helping the CPU stay at its top speed when truly
 busy, rather than shifting back and forth in speed.  This tunable has no
 effect on behavior at lower speeds/lower CPU loads.
 
+powersave_bias: this parameter takes a value between 0 to 1000. It
+defines the percentage (times 10) value of the target frequency that
+will be shaved off of the target. For example, when set to 100 -- 10%,
+when ondemand governor would have targeted 1000 MHz, it will target
+1000 MHz - (10% of 1000 MHz) = 900 MHz instead. This is set to 0
+(disabled) by default.
+When AMD frequency sensitivity powersave bias driver --
+drivers/cpufreq/amd_freq_sensitivity.c is loaded, this parameter
+defines the workload frequency sensitivity threshold in which a lower
+frequency is chosen instead of ondemand governor's original target.
+The frequency sensitivity is a hardware reported (on AMD Family 16h
+Processors and above) value between 0 to 100% that tells software how
+the performance of the workload running on a CPU will change when
+frequency changes. A workload with sensitivity of 0% (memory/IO-bound)
+will not perform any better on higher core frequency, whereas a
+workload with sensitivity of 100% (CPU-bound) will perform better
+higher the frequency. When the driver is loaded, this is set to 400
+by default -- for CPUs running workloads with sensitivity value below
+40%, a lower frequency is chosen. Unloading the driver or writing 0
+will disable this feature.
+
 2.5 Conservative
 ----------------
 
@@ -191,6 +212,12 @@ governor but for the opposite direction.  For example when set to its
 default value of '20' it means that if the CPU usage needs to be below
 20% between samples to have the frequency decreased.
 
+sampling_down_factor: similar functionality as in "ondemand" governor.
+But in "conservative", it controls the rate at which the kernel makes
+a decision on when to decrease the frequency while running in any
+speed. Load for frequency increase is still evaluated every
+sampling rate.
+
 
 3. The Governor Interface in the CPUfreq Core
 =============================================
diff --git a/Documentation/cpu-freq/index.txt b/Documentation/cpu-freq/index.txt
index 3d0b915035b..dc024ab4054 100644
--- a/Documentation/cpu-freq/index.txt
+++ b/Documentation/cpu-freq/index.txt
@@ -35,8 +35,8 @@ Mailing List
 ------------
 There is a CPU frequency changing CVS commit and general list where
 you can report bugs, problems or submit patches. To post a message,
-send an email to cpufreq@vger.kernel.org, to subscribe go to
-http://vger.kernel.org/vger-lists.html#cpufreq and follow the
+send an email to linux-pm@vger.kernel.org, to subscribe go to
+http://vger.kernel.org/vger-lists.html#linux-pm and follow the
 instructions there.
 
 Links
diff --git a/Documentation/cpu-freq/intel-pstate.txt b/Documentation/cpu-freq/intel-pstate.txt
new file mode 100644
index 00000000000..a69ffe1d54d
--- /dev/null
+++ b/Documentation/cpu-freq/intel-pstate.txt
@@ -0,0 +1,43 @@
+Intel P-state driver
+--------------------
+
+This driver implements a scaling driver with an internal governor for
+Intel Core processors.  The driver follows the same model as the
+Transmeta scaling driver (longrun.c) and implements the setpolicy()
+instead of target().  Scaling drivers that implement setpolicy() are
+assumed to implement internal governors by the cpufreq core. All the
+logic for selecting the current P state is contained within the
+driver; no external governor is used by the cpufreq core.
+
+Intel SandyBridge+ processors are supported.
+
+New sysfs files for controlling P state selection have been added to
+/sys/devices/system/cpu/intel_pstate/
+
+      max_perf_pct: limits the maximum P state that will be requested by
+      the driver stated as a percentage of the available performance. The
+      available (P states) performance may be reduced by the no_turbo
+      setting described below.
+
+      min_perf_pct: limits the minimum P state that will be requested by
+      the driver stated as a percentage of the max (non-turbo)
+      performance level.
+
+      no_turbo: limits the driver to selecting P states below the turbo
+      frequency range.
+
+For contemporary Intel processors, the frequency is controlled by the
+processor itself and the P-states exposed to software are related to
+performance levels.  The idea that frequency can be set to a single
+frequency is fiction for Intel Core processors. Even if the scaling
+driver selects a single P state the actual frequency the processor
+will run at is selected by the processor itself.
+
+New debugfs files have also been added to /sys/kernel/debug/pstate_snb/
+
+      deadband
+      d_gain_pct
+      i_gain_pct
+      p_gain_pct
+      sample_rate_ms
+      setpoint
diff --git a/Documentation/cpu-freq/user-guide.txt b/Documentation/cpu-freq/user-guide.txt
index 04f6b32993e..ff2f28332cc 100644
--- a/Documentation/cpu-freq/user-guide.txt
+++ b/Documentation/cpu-freq/user-guide.txt
@@ -190,11 +190,11 @@ scaling_max_freq                show the current "policy limits" (in
                                 first set scaling_max_freq, then
                                 scaling_min_freq.
 
-affected_cpus :                 List of CPUs that require software coordination
-                                of frequency.
+affected_cpus :                 List of Online CPUs that require software
+                                coordination of frequency.
 
-related_cpus :                  List of CPUs that need some sort of frequency
-                                coordination, whether software or hardware.
+related_cpus :                  List of Online + Offline CPUs that need software
+                                coordination of frequency.
 
 scaling_driver :                Hardware driver for cpufreq.
 
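
Note on the frequency table conventions in the cpu-drivers.txt hunks: the table ends with a CPUFREQ_TABLE_END entry, entries marked CPUFREQ_ENTRY_INVALID are skipped, and driver_data is a free-form per-entry value. A minimal userspace sketch of that walk follows. It is not kernel code: struct freq_entry, TABLE_END, ENTRY_INVALID, fastest_valid() and demo_table are hypothetical stand-ins invented for this illustration, and the sentinel values are assumptions of the sketch.

```c
#include <stddef.h>

/* Hypothetical sentinels for this sketch; not taken from kernel headers. */
#define ENTRY_INVALID (~0u)
#define TABLE_END     (~1u)

/* Stand-in for one frequency table entry as the patch describes it. */
struct freq_entry {
    unsigned int driver_data; /* driver-private value, any meaning */
    unsigned int frequency;   /* kHz, or one of the sentinels above */
};

/*
 * Walk the table up to the TABLE_END terminator, skip ENTRY_INVALID
 * entries, and return the entry with the highest frequency.
 * Returns NULL when the table holds no valid entry.
 * The entries do not need to be in ascending order.
 */
static const struct freq_entry *fastest_valid(const struct freq_entry *table)
{
    const struct freq_entry *pos;
    const struct freq_entry *best = NULL;

    for (pos = table; pos->frequency != TABLE_END; pos++) {
        if (pos->frequency == ENTRY_INVALID)
            continue;
        if (best == NULL || pos->frequency > best->frequency)
            best = pos;
    }
    return best;
}

/* Example table: unsorted, with one skipped entry and the terminator. */
static const struct freq_entry demo_table[] = {
    { 0, 800000 },
    { 1, ENTRY_INVALID },
    { 2, 1600000 },
    { 3, 1200000 },
    { 0, TABLE_END },
};
```

Calling fastest_valid(demo_table) returns a pointer to the third entry (driver_data 2, 1600000 kHz). In a driver, the cpufreq_for_each_valid_entry() iterator mentioned in the patch would take the place of the hand-written loop.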

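
Note on the powersave_bias text added to governors.txt: the value is in tenths of a percent (0 to 1000) and is shaved off the frequency the ondemand governor would otherwise target. The arithmetic alone can be sketched as below. The function name apply_powersave_bias() and the clamping of out-of-range input are assumptions of this sketch; the governor itself may do more than this subtraction when it picks an actual table frequency.

```c
/*
 * Reduce a target frequency by a powersave bias given in tenths of a
 * percent (0..1000), following the formula in the documentation text:
 *   biased = target - target * bias / 1000
 * Values above 1000 are clamped; that clamp is a choice made here.
 */
static unsigned int apply_powersave_bias(unsigned int target_khz,
                                         unsigned int bias)
{
    unsigned long long reduction;

    if (bias > 1000)
        bias = 1000;

    /* Widen before multiplying so large kHz values cannot overflow. */
    reduction = (unsigned long long)target_khz * bias / 1000;
    return target_khz - (unsigned int)reduction;
}
```

With the documented example, a 1000 MHz target (1000000 kHz) and a bias of 100 give 900000 kHz; a bias of 0 leaves the target unchanged.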