#

# Summary

#

It was brought to my attention that the scalability of network namespace creation regressed during kernel development.

The following script was used for all the tests and for generating the charts:

#!/bin/bash

IP=/sbin/ip

function add_fake_router_uuid() {

    j=`uuidgen`

    $IP netns add bar-${j}

    $IP netns exec bar-${j} $IP link set lo up

    $IP netns exec bar-${j} sysctl -w net.ipv4.ip_forward=1 > /dev/null

    k=`echo $j | cut -b -11`

    $IP link add qro-${k} type veth peer name qri-${k} netns bar-${j}

    $IP link add qgo-${k} type veth peer name qgi-${k} netns bar-${j}

}

for i in `seq 1 $1`; do

    if [ `expr $i % 250` -eq 0 ]; then

        echo "$i by `date +%s`"

    fi

    add_fake_router_uuid

done

I measured how many "fake routers" (created by the script above) could be added per second, from 0 up to the 4000 created routers mark. Using this script and a git bisect on the kernel tree, I was led to one specific commit causing the regression: #911af50 "rcu: Provide compile-time control for no-CBs CPUs". Even though this change was experimental at that point, it introduced a performance/scalability regression (explained below) that still persists, and the option it added seems to be the default for distributions nowadays.
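
The per-second rate can be derived from the progress lines the script prints ("<count> by <epoch seconds>"). A minimal sketch, assuming the script output was saved to a file; the function name rate_per_interval and the file name run.log are hypothetical, and this is not necessarily how the charts were produced:

```shell
#!/bin/bash
# Convert "<count> by <epoch>" progress lines (stdin) into creation rates.
# Prints "<count> <routers/sec>" for every interval after the first line,
# since the first line has no earlier timestamp to compare against.
rate_per_interval() {
    awk '
        $2 == "by" {
            if (seen) {
                dt = $3 - prev_t
                dn = $1 - prev_n
                # date +%s has 1-second resolution: an interval finished
                # within the same second is flagged instead of divided by 0.
                if (dt > 0) printf "%d %.1f\n", $1, dn / dt
                else        printf "%d inf\n", $1
            }
            prev_n = $1; prev_t = $3; seen = 1
        }
    '
}
```

Usage would be something like `rate_per_interval < run.log`, giving one rate per 250-router step.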

RCU-related code appeared to be responsible for the problem. With that in mind, every commit in the range v3.8..master that changed any of these files was tested: "kernel/rcutree.c kernel/rcutree.h kernel/rcutree_plugin.h include/trace/events/rcu.h include/linux/rcupdate.h". The idea was to track the performance regression throughout rcu development. In the worst case, with the regression not being related to rcu, I would still have data to interpret the performance/scalability regression.
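
A list of candidate commits like this can be obtained with git's path filtering. The following is only a sketch of one way to do it, run from inside a kernel git checkout; the variable and function names are hypothetical, and since the rcu sources were later moved to kernel/rcu (commit 4102ada), newer paths would presumably have to be added to cover the whole range:

```shell
#!/bin/bash
# Files named in the text above. Later kernels moved the rcu sources under
# kernel/rcu/, so newer paths may need to be appended (assumption).
RCU_FILES="kernel/rcutree.c kernel/rcutree.h kernel/rcutree_plugin.h include/trace/events/rcu.h include/linux/rcupdate.h"

# List the commits in <range> that touched any of RCU_FILES, oldest first.
list_rcu_commits() {
    range=${1:-v3.8..master}
    # RCU_FILES is intentionally unquoted so it splits into one path per word.
    git log --oneline --reverse "$range" -- $RCU_FILES
}
```

Calling `list_rcu_commits v3.8..master` prints one abbreviated hash and subject per line, which can then be numbered for the charts.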

All the text below refers to 2 groups of charts generated during the study:

1) Kernel git tags from 3.8 to 3.14.

2) Kernel git commits for rcu development (111 commits).

Since the results differed depending on the number of cpus and on how the no-cb cpus were configured, 3 kernel config variants were used for every measurement:

- CONFIG_RCU_NOCB_CPU (disabled): nocbno

- CONFIG_RCU_NOCB_CPU_ALL (enabled): nocball

- CONFIG_RCU_NOCB_CPU_NONE (enabled): nocbnone

Obs: For the 1 cpu cases, nocbno, nocbnone and nocball behave the same, since with only 1 cpu there is no no-cb cpu.

After the charts were generated, it was clear that NOCB_CPU_ALL (4 cpus) hurt the performance of the "fake routers" creation process, and that this regression persists up to the upstream version. It was also clear that, after commit #911af50, having more than 1 cpu does not improve performance/scalability for netns; it makes it worse.

#911af50

...

+#ifdef CONFIG_RCU_NOCB_CPU_ALL

+   pr_info("\tExperimental no-CBs for all CPUs\n");

+   cpumask_setall(rcu_nocb_mask);

+#endif /* #ifdef CONFIG_RCU_NOCB_CPU_ALL */

...
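
On a running kernel, the message printed by this hunk gives a quick way to tell whether all cpus were made no-CBs cpus at boot. A small sketch; the function name is hypothetical, and the exact message text may differ between kernel versions (see commit 9a5739d "rcu: Remove "Experimental" flags" further below), so only the "no-CBs" part is matched:

```shell
#!/bin/bash
# Print only the RCU no-CBs lines from kernel log text read on stdin.
nocb_log_lines() {
    grep -i 'no-CBs'
}
```

Typical usage would be `dmesg | nocb_log_lines`; no output suggests that no cpu was set up as a no-CBs cpu, or that the boot messages have already rotated out of the log.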

Comparing the points that stand out (see charts):

#81e5949 - good

#911af50 - bad

I was able to see that, from the script above, the following lines have a major impact on netns scalability/performance:

1) ip netns add -> huge performance regression:

1 cpu: no regression

4 cpu: regression for NOCB_CPU_ALL

obs: regression from 250 netns/sec to 50 netns/sec at the 500 netns already created mark

2) ip netns exec -> some performance regression

1 cpu: no regression

4 cpu: regression for NOCB_CPU_ALL

obs: regression from 40 netns/sec (+1 exec per netns creation) to 20 netns/sec at the 500 netns created mark
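
To look at one of these lines in isolation, the loop can be reduced to that single command. A minimal sketch for "ip netns add" only, assuming it is run as root; the function name and the "isol-" namespace prefix are hypothetical and chosen only to avoid clashing with the "bar-" namespaces of the main script:

```shell
#!/bin/bash
IP=${IP:-/sbin/ip}

# Create <count> namespaces and nothing else, then print the average rate.
netns_add_rate() {
    count=$1
    start=$(date +%s)
    i=1
    while [ "$i" -le "$count" ]; do
        $IP netns add "isol-$i" || return 1
        i=$((i + 1))
    done
    elapsed=$(( $(date +%s) - start ))
    # date +%s has 1-second resolution; treat a sub-second run as 1 second.
    [ "$elapsed" -gt 0 ] || elapsed=1
    echo "$count netns in ${elapsed}s: $((count / elapsed)) netns/sec"
}
```

For example, `netns_add_rate 500` on each kernel variant gives one average figure to compare; the namespaces can be removed afterwards with `ip netns delete isol-N`.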

#

# Assumption (to be confirmed)

#

RCU callbacks being offloaded to other cpus caused the regression in copy_net_ns <- create_new_namespaces, or in unshare(CLONE_NEWNET).


#

# Kernel .config

#

#

# Ubuntu default config file for kernel

#

config-3.8.0-39-generic does not configure NOCB

# CONFIG_RCU_NOCB_CPU is not set

CONFIG_RCU_FAST_NO_HZ=y

config-3.11.0-20-generic configures NOCB and NOCB_ALL

        

CONFIG_RCU_NOCB_CPU=y

# CONFIG_RCU_NOCB_CPU_NONE is not set

# CONFIG_RCU_NOCB_CPU_ZERO is not set

CONFIG_RCU_NOCB_CPU_ALL=y

CONFIG_RCU_FAST_NO_HZ=y

config-3.13.0-24-generic configures NOCB and NOCB_ALL

        

CONFIG_RCU_NOCB_CPU=y

CONFIG_RCU_NOCB_CPU_ALL=y

CONFIG_RCU_FAST_NO_HZ=y

#

# Test case config file for kernel

#

All kernels ending with "-nocb" and tests like "nocbno-xxx"

# CONFIG_RCU_NOCB_CPU is not set

CONFIG_RCU_FAST_NO_HZ=y

All kernels ending with "-nonone" and tests like "nocbnone-xxx"

CONFIG_RCU_NOCB_CPU=y

CONFIG_RCU_NOCB_CPU_NONE=y

# CONFIG_RCU_NOCB_CPU_ZERO is not set

# CONFIG_RCU_NOCB_CPU_ALL is not set

CONFIG_RCU_FAST_NO_HZ=y

All kernels ending with "-nocball" and tests like "nocball-xxx"

CONFIG_RCU_NOCB_CPU=y

# CONFIG_RCU_NOCB_CPU_NONE is not set

# CONFIG_RCU_NOCB_CPU_ZERO is not set

CONFIG_RCU_NOCB_CPU_ALL=y

CONFIG_RCU_FAST_NO_HZ=y
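
To check which of the three test variants a given kernel config corresponds to, the options above can be matched directly. A small sketch; the function name nocb_variant is hypothetical, and the "other" case covers configs (for example CONFIG_RCU_NOCB_CPU_ZERO=y) that were not part of this study:

```shell
#!/bin/bash
# Print the test variant name (nocbno, nocbnone, nocball) for a config file.
nocb_variant() {
    # "=y" right after the option name keeps CONFIG_RCU_NOCB_CPU from also
    # matching CONFIG_RCU_NOCB_CPU_ALL / _NONE / _ZERO.
    if ! grep -q '^CONFIG_RCU_NOCB_CPU=y' "$1"; then
        echo nocbno
    elif grep -q '^CONFIG_RCU_NOCB_CPU_ALL=y' "$1"; then
        echo nocball
    elif grep -q '^CONFIG_RCU_NOCB_CPU_NONE=y' "$1"; then
        echo nocbnone
    else
        echo other
    fi
}
```

On Ubuntu the config of the running kernel is normally available as /boot/config-$(uname -r), so `nocb_variant "/boot/config-$(uname -r)"` would report the variant in use.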

#

# How to see charts

#

In charts/250.html, if you check only blue, orange, "nocbnone-1cpu-250" and "nocbnone-4cpu-250", you get an overview of the performance regression from kernel version 3.9-rc2 to 3.15-rc5: 1 cpu performs better than 4 cpus. Moving to the next chart (NEXT->), at the 500 fake routers created mark, the chart continues to show this behavior.

This procedure can be repeated comparing "nocbno", "nocbnone" and "nocball" for 1 and 4 cpus.

# notes

- 1 cpu performance is better than 4 cpu performance (does not scale)

#

# General Observation

#

1) Before #911af50 (#007) there was no such option as NOCB_CPU_ALL or NOCB_CPU_NONE. You can see that, up to that commit, 1 cpu performance was lower than 4 cpu performance, and that the only meaningful throughput values are those for "nocbno-1cpu-XXX" and "nocbno-4cpu-XXX".

2) The commits shown on the charts are the commits that touched the following files: kernel/rcutree.c kernel/rcutree.h kernel/rcutree_plugin.h include/trace/events/rcu.h include/linux/rcupdate.h. Of course, some other change/commit in the source code in between these commits could cause some kind of regression but, since git bisect identified 911af50 (#007) as the first bad commit, the decision was to test all rcu changes.

3) All the observations were made at the "250 fake routers created" mark, which means that the test had just started. Sometimes it scales further (up to 2500 fake routers), sometimes not. The objective here was to set a tangible parameter for the analysis.

#

# This whole test started as a kernel bisect, resulting in a bad commit: 911af50 (#007).

# Group of changes that introduced performance regression for this testcase:

#

All comments (nocbno, nocbnone, nocball) are in the following format:

### 1st: performance regression 4 cpus: #007 (3.9.0-rc2)

----        * 65d798f rcu: Kick adaptive-ticks CPUs that are holding up RCU grace periods

#021        | *-.   6d87669 Merge branches 'doc.2013.03.12a', 'fixes.2013.03.13a' and 'idlenocb.2013.03.26b' into HEAD

----        | |\ \  

----        |/ / /  

#020        | | * 910ee45 rcu: Make rcu_accelerate_cbs() note need for future grace periods

#019        | | * 0446be4 rcu: Abstract rcu_start_future_gp() from rcu_nocb_wait_gp()

#018        | | * 8b425aa rcu: Rename n_nocb_gp_requests to need_future_gp

#017        | | * b846208 rcu: Push lock release to rcu_start_gp()'s callers

#016        | | * bd9f068 rcu: Repurpose no-CBs event tracing to future-GP events

#015        | | * b92db6c rcu: Rearrange locking in rcu_start_gp()

#014        | | * c0f4dfd rcu: Make RCU_FAST_NO_HZ take advantage of numbered callbacks

#013        | | * b11cc57 rcu: Accelerate RCU callbacks at grace-period end

#012        | | * 5e44ce3 rcu: Export RCU_FAST_NO_HZ parameters to sysfs

#011        | | * a488985 rcu: Distinguish "rcuo" kthreads by RCU flavor

#010        | | * 09c7b89 rcu: Add event tracing for no-CBs CPUs' grace periods

#009        | | * 21e7a60 rcu: Add event tracing for no-CBs CPUs' callback registration

#008        | | * dae6e64 rcu: Introduce proper blocking to no-CBs kthreads GP waits

#007        | | * 911af50 rcu: Provide compile-time control for no-CBs CPUs

----        | | * 34ed624 rcu: Remove restrictions on no-CBs CPUs

----        | |/  

----        |/|  

#006        | * 81e5949 rcu: Tone down debugging during boot-up and shutdown.

#005        | * 6231069 rcu: Add softirq-stall indications to stall-warning messages

#004        | * 6f0a6ad rcu: Delete unused rcu_node "wakemask" field

#002        | * b0f7403 rcu: Avoid invoking RCU core on offline CPUs

----        *---.   40393f5 Merge branches 'doctorture.2013.01.29a', 'fixes.2013.01.26a', 'tagcb.2013.01.24a' and 'tiny.2013.01.29b'

OBS:

!!!!        1 CPU        - good performance for nocbno, nocbnone, nocball

!!!!        4 CPU        - good performance for nocbno, nocbnone

????        4 CPU        - bad performance for nocball

### 2nd: performance regression 4 cpus: #014 (3.9.0-rc2)

----        * 65d798f rcu: Kick adaptive-ticks CPUs that are holding up RCU grace periods

#021        | *-.   6d87669 Merge branches 'doc.2013.03.12a', 'fixes.2013.03.13a' and 'idlenocb.2013.03.26b' into HEAD

----        | |\ \  

----        |/ / /  

#020        | | * 910ee45 rcu: Make rcu_accelerate_cbs() note need for future grace periods

#019        | | * 0446be4 rcu: Abstract rcu_start_future_gp() from rcu_nocb_wait_gp()

#018        | | * 8b425aa rcu: Rename n_nocb_gp_requests to need_future_gp

#017        | | * b846208 rcu: Push lock release to rcu_start_gp()'s callers

#016        | | * bd9f068 rcu: Repurpose no-CBs event tracing to future-GP events

#015        | | * b92db6c rcu: Rearrange locking in rcu_start_gp()

#014        | | * c0f4dfd rcu: Make RCU_FAST_NO_HZ take advantage of numbered callbacks

#013        | | * b11cc57 rcu: Accelerate RCU callbacks at grace-period end

#012        | | * 5e44ce3 rcu: Export RCU_FAST_NO_HZ parameters to sysfs

#011        | | * a488985 rcu: Distinguish "rcuo" kthreads by RCU flavor

#010        | | * 09c7b89 rcu: Add event tracing for no-CBs CPUs' grace periods

#009        | | * 21e7a60 rcu: Add event tracing for no-CBs CPUs' callback registration

#008        | | * dae6e64 rcu: Introduce proper blocking to no-CBs kthreads GP waits

#007        | | * 911af50 rcu: Provide compile-time control for no-CBs CPUs

----        | | * 34ed624 rcu: Remove restrictions on no-CBs CPUs

----        | |/  

----        |/|  

----        | * 81e5949 rcu: Tone down debugging during boot-up and shutdown.

OBS:

!!!!        1 CPU        - good performance for nocbno, nocbnone, nocball

????        4 CPU        - bad performance for nocbno, nocbnone (introduced on commit #014)

????        4 CPU        - bad performance for nocball (introduced on commit #007)

### 3rd: bad performance stall 4 cpus: between #014 and #021 (3.9.0-rc2 and 3.9.0)

----        * 65d798f rcu: Kick adaptive-ticks CPUs that are holding up RCU grace periods

#021        | *-.   6d87669 Merge branches 'doc.2013.03.12a', 'fixes.2013.03.13a' and 'idlenocb.2013.03.26b' into HEAD

----        | |\ \  

----        |/ / /  

#020        | | * 910ee45 rcu: Make rcu_accelerate_cbs() note need for future grace periods

#019        | | * 0446be4 rcu: Abstract rcu_start_future_gp() from rcu_nocb_wait_gp()

#018        | | * 8b425aa rcu: Rename n_nocb_gp_requests to need_future_gp

#017        | | * b846208 rcu: Push lock release to rcu_start_gp()'s callers

#016        | | * bd9f068 rcu: Repurpose no-CBs event tracing to future-GP events

#015        | | * b92db6c rcu: Rearrange locking in rcu_start_gp()

#014        | | * c0f4dfd rcu: Make RCU_FAST_NO_HZ take advantage of numbered callbacks

#013        | | * b11cc57 rcu: Accelerate RCU callbacks at grace-period end

#012        | | * 5e44ce3 rcu: Export RCU_FAST_NO_HZ parameters to sysfs

#011        | | * a488985 rcu: Distinguish "rcuo" kthreads by RCU flavor

#010        | | * 09c7b89 rcu: Add event tracing for no-CBs CPUs' grace periods

#009        | | * 21e7a60 rcu: Add event tracing for no-CBs CPUs' callback registration

#008        | | * dae6e64 rcu: Introduce proper blocking to no-CBs kthreads GP waits

#007        | | * 911af50 rcu: Provide compile-time control for no-CBs CPUs

----        | | * 34ed624 rcu: Remove restrictions on no-CBs CPUs

----        | |/  

----        |/|  

----        | * 81e5949 rcu: Tone down debugging during boot-up and shutdown.

OBS:

!!!!        1 CPU        - good performance for nocbno, nocbnone, nocball

????        4 CPU        - bad performance for nocbno, nocbnone (introduced on commit #014)

????        4 CPU        - bad performance for nocball (introduced on commit #007)

### 4th: temporary "improvement" 1 and 4 cpus: between #021 (3.9.0-rc2) and #025 (3.9.0)

OBS: improvement just because of different merges

OBS: last commit, #025 (1f889ec), merges rcu changes into 3.9.0 release

----        * efc151c rcu: Convert rcutree_plugin.h printk calls

----        * d7f3e20 rcu: Convert rcutree.c printk calls

----        * 971394f rcu: Fix deadlock with CPU hotplug, RCU GP init, and timer migration

----        * 016a8d5 rcu: Don't call wakeup() with rcu_node structure ->lock held

----        * 12bcbe6 rcu: Add _notrace variation of rcu_dereference_raw() and hlist_for_each_entry_rcu()

----        * 6faf728 rcu: Fix comparison sense in rcu_needs_cpu()

----        * c032862 Merge commit '8700c95adb03' into timers/nohz

#023        * d1e43fa nohz: Ensure full dynticks CPUs are RCU nocbs

#022        * 65d798f rcu: Kick adaptive-ticks CPUs that are holding up RCU grace periods

#021        | *-.   6d87669 Merge branches 'doc.2013.03.12a', 'fixes.2013.03.13a' and 'idlenocb.2013.03.26b' into HEAD

----        | |\ \  

----        |/ / /  

#020        | | * 910ee45 rcu: Make rcu_accelerate_cbs() note need for future grace periods

#019        | | * 0446be4 rcu: Abstract rcu_start_future_gp() from rcu_nocb_wait_gp()

#018        | | * 8b425aa rcu: Rename n_nocb_gp_requests to need_future_gp

#017        | | * b846208 rcu: Push lock release to rcu_start_gp()'s callers

#016        | | * bd9f068 rcu: Repurpose no-CBs event tracing to future-GP events

#015        | | * b92db6c rcu: Rearrange locking in rcu_start_gp()

#014        | | * c0f4dfd rcu: Make RCU_FAST_NO_HZ take advantage of numbered callbacks

#013        | | * b11cc57 rcu: Accelerate RCU callbacks at grace-period end

#012        | | * 5e44ce3 rcu: Export RCU_FAST_NO_HZ parameters to sysfs

#011        | | * a488985 rcu: Distinguish "rcuo" kthreads by RCU flavor

#010        | | * 09c7b89 rcu: Add event tracing for no-CBs CPUs' grace periods

#009        | | * 21e7a60 rcu: Add event tracing for no-CBs CPUs' callback registration

#008        | | * dae6e64 rcu: Introduce proper blocking to no-CBs kthreads GP waits

#007        | | * 911af50 rcu: Provide compile-time control for no-CBs CPUs

----        | | * 34ed624 rcu: Remove restrictions on no-CBs CPUs

----        | |/  

----        |/|  

----        | * 81e5949 rcu: Tone down debugging during boot-up and shutdown.

OBS:

!!!!        1 CPU        - good performance for nocbno, nocbnone, nocball

!!!!        4 CPU        - excellent performance for nocbno, nocbnone, nocball

### 5th: Zeroed performance 4 cpus: between #025 (3.9.0) and #049 (3.10.0-rc5)

----        * 77bd397 sched: Update rq clock before migrating tasks out of dying CPU

#049        *---.   be77f87 Merge 'cbnum.2013.06.10a' 'doc.2013.06.10a' 'fixes.2013.06.10a' 'srcu.2013.06.10a' and 'tiny.2013.06.10a'

----        |\ \ \  

----        | | | * 1496144 rcu: Shrink TINY_RCU by reworking CPU-stall ifdefs

#048        | | | * 2439b69 rcu: Shrink TINY_RCU by moving exit_rcu()

----        | | | * 7807acd rcu: Remove TINY_PREEMPT_RCU tracing documentation

----        | | | * 318bdcd rcu: Consolidate rcutiny_plugin.h ifdefs

----        | | | * fa2b3b0 rcu: Remove rcu_preempt_note_context_switch()

----        | | | * 57f1801 rcu: Remove the CONFIG_TINY_RCU ifdefs in rcutiny.h

----        | | | * 4879c84 rcu: Remove check_cpu_stall_preempt()

#047        | | | * 9dc5ad3 rcu: Simplify RCU_TINY RCU callback invocation

----        | | | * 58c4e69 rcu: Remove rcu_preempt_process_callbacks()

----        | | | * 47d6593 rcu: Remove rcu_preempt_remove_callbacks()

----        | | | * 9acaac8 rcu: Remove rcu_preempt_check_callbacks()

----        | | | * 221304e rcu: Remove show_tiny_preempt_stats()

#046        | | | * 127781d rcu: Remove TINY_PREEMPT_RCU

----        | | | | * 99f8891 rcu: Remove srcu_read_lock_raw() and srcu_read_unlock_raw().

----        | | | |/  

----        | | * | 676c3dc rcu: Apply Dave Jones's NOCB Kconfig help feedback

----        | | * | 4982969 rcu: Merge adjacent identical ifdefs

#045        | | * | 026ad28 rcu: Drive quiescent-state-forcing delay from HZ

#044        | | * | 9a5739d rcu: Remove "Experimental" flags

----        | | |/  

----        | * | f7bac9b kthread: Add kworker kthreads to OS-jitter documentation

----        | * | ce5f4fc nohz_full: Document additional restrictions

----        | * | 295fde8 nohz_full: Update based on Sedat Dilek review

----        | |/  

#043        * | 05eb552 rcu: Move redundant call to note_gp_changes() into called function

#042        * | ce3d9c0 rcu: Inline trivial wrapper function rcu_start_gp_per_cpu()

#041        * | 63274cf rcu: Eliminate check_for_new_grace_period() wrapper function

#040        * | ba9fbe9 rcu: Merge __rcu_process_gp_end() into __note_gp_changes()

#039        * | 470716f rcu: Switch callers from rcu_process_gp_end() to note_gp_changes()

#037        * | d34ea32 rcu: Rename note_new_gpnum() to note_gp_changes()

#036        * | 398ebe6 rcu: Make __note_new_gpnum() check for ends of prior grace periods

#035        * | 6eaef63 rcu: Move code to apply callback-numbering simplifications

----        |/  

#038        * efc151c rcu: Convert rcutree_plugin.h printk calls

#034        * d7f3e20 rcu: Convert rcutree.c printk calls

----        | * 521921b kvm: Move guest entry/exit APIs to context_tracking

----        | * 45eacc6 vtime: Use consistent clocks among nohz accounting

#033        * 971394f rcu: Fix deadlock with CPU hotplug, RCU GP init, and timer migration

----        * d628409 trace: Allow idle-safe tracepoints to be called from irq

#028        * 6faf728 rcu: Fix comparison sense in rcu_needs_cpu()

----        * 49717cb kthread: Document ways of reducing OS jitter due to per-CPU kthreads

----        | * 265f22a sched: Keep at least 1 tick per second for active dynticks tasks

----        | * 73c3082 rcu: Fix full dynticks' dependency on wide RCU nocb mode

#026        | * c032862 Merge commit '8700c95adb03' into timers/nohz

OBS:

!!!!        1 CPU        - good performance for nocbno, nocbnone, nocball

????        4 CPU        - zero performance for nocbno, nocbnone (introduced on commit #014)

????        4 CPU        - bad performance for nocball (introduced on commit #007)

### 6th: small performance improvement 4 cpus: between #048 (3.10.0-rc5) and #066 (3.11.0-rc2)

#066        *   25f27ce Merge branches 'doc.2013.08.19a', 'fixes.2013.08.20a', 'sysidle.2013.08.31a' and 'torture.2013.08.20a'

----        |\  

#065        | * eb75767 nohz_full: Force RCU's grace-period kthreads onto timekeeping CPU

#064        | * 0edd1b1 nohz_full: Add full-system-idle state machine

#062        | * 217af2a nohz_full: Add full-system-idle arguments to API

#061        | * d4bd54f nohz_full: Add full-system idle states and variables

#060        | * eb348b8 nohz_full: Add per-CPU idle-state tracking

#059        | * 2333210 nohz_full: Add rcu_dyntick data for scalable detection of all-idle state

#058        | * feed66e rcu: Eliminate unused APIs intended for adaptive ticks

#063        * 458fb38 rcu: Simplify _rcu_barrier() processing

#057        * 1eafd31 rcu: Avoid redundant grace-period kthread wakeups

#056        * ae15018 rcu: Make call_rcu() leak callbacks for debug-object errors

#049        *---.   be77f87 Merge 'cbnum.2013.06.10a' 'doc.2013.06.10a' 'fixes.2013.06.10a' 'srcu.2013.06.10a' 'tiny.2013.06.10a'

----        |\ \ \  

#048        | | | * 2439b69 rcu: Shrink TINY_RCU by moving exit_rcu()

OBS:

!!!!        1 CPU        - good performance for nocbno, nocbnone, nocball

????        4 CPU        - small performance improvement for nocbno, nocbnone (introduced on commit #014)

????        4 CPU        - performance improvement for nocball (introduced on commit #007) but still 1/2 of 1 cpu performance

 

### 7th: another small performance improvement 4 cpus: between #066 (3.11.0-rc2) and #090 (3.12.0-rc1)

#093        * 4102ada rcu: Move RCU-related source code to kernel/rcu directory

#092        *   2529973 Merge branch 'idle.2013.09.25a' into HEAD

----        |\  

#090        | * 5c173eb rcu: Consistent rcu_is_watching() naming

#089        | * f9ffc31e rcu: Change EXPORT_SYMBOL() to EXPORT_SYMBOL_GPL()

#088        | * cc6783f rcu: Is it safe to enter an RCU read-side critical section?

#087        | * c337f8f rcu: Throttle invoke_rcu_core() invocations due to non-lazy callbacks

#086        | * c229828 rcu: Throttle rcu_try_advance_all_cbs() execution

#085        | * 7a497c9 rcu: Remove redundant code from rcu_cleanup_after_idle()

#091        * | 25e03a7 Merge branch 'gp.2013.09.25a' into HEAD

#083        * | 15f5191 rcu: Avoid sparse warnings in rcu_nocb_wake trace event

#082        * | 69a79bb rcu: Track rcu_nocb_kthread()'s sleeping and awakening

#081        * | 756cbf6 rcu: Distinguish between NOCB and non-NOCB rcu_callback trace events

#080        * | 9261dd0 rcu: Add tracing for rcuo no-CBs CPU wakeup handshake

#079        * | bb311ec rcu: Add tracing of normal (non-NOCB) grace-period requests

#078        * | 63c4db7 rcu: Add tracing to rcu_gp_kthread()

#077        * | 591c6d1 rcu: Flag lockless access to ->gp_flags with ACCESS_ONCE()

#076        * | 88d6df6 rcu: Prevent spurious-wakeup DoS attack on rcu_gp_kthread()

#075        * | f7be820 rcu: Improve grace-period start logic

----        |/  

#074        | * 0d75292 rcu: Have rcutiny tracepoints use tracepoint_string()

#073        | * 26cdfed rcu: Reject memory-order-induced stall-warning false positives

#072        | * 69c8d28 rcu: Micro-optimize rcu_cpu_has_callbacks()

#071        | * 289828e rcu: Silence unused-variable warnings

#069        | * 829511d rcu: Fix dubious "if" condition in __call_rcu_nocb_enqueue()

#068        | * 01896f7 rcu: Convert local functions to static

#067        | * b3f2d02 rcu: Use proper cpp macro for ->gp_flags

----        |/  

#066        *   25f27ce Merge branches 'doc.2013.08.19a', 'fixes.2013.08.20a', 'sysidle.2013.08.31a' and 'torture.2013.08.20a'

----        |\  

#065        | * eb75767 nohz_full: Force RCU's grace-period kthreads onto timekeeping CPU

#064        | * 0edd1b1 nohz_full: Add full-system-idle state machine

#062        | * 217af2a nohz_full: Add full-system-idle arguments to API

#061        | * d4bd54f nohz_full: Add full-system idle states and variables

#060        | * eb348b8 nohz_full: Add per-CPU idle-state tracking

#059        | * 2333210 nohz_full: Add rcu_dyntick data for scalable detection of all-idle state

#058        | * feed66e rcu: Eliminate unused APIs intended for adaptive ticks

#063        * 458fb38 rcu: Simplify _rcu_barrier() processing

#057        * 1eafd31 rcu: Avoid redundant grace-period kthread wakeups

#056        * ae15018 rcu: Make call_rcu() leak callbacks for debug-object errors

#049        *---.   be77f87 Merge 'cbnum.2013.06.10a' 'doc.2013.06.10a' 'fixes.2013.06.10a' 'srcu.2013.06.10a' 'tiny.2013.06.10a'

----        |\ \ \  

#048        | | | * 2439b69 rcu: Shrink TINY_RCU by moving exit_rcu()

OBS:

!!!!        1 CPU        - good performance for nocbno, nocbnone, nocball

????        4 CPU        - small performance improvement for nocbno, nocbnone (introduced on commit #014)

----        4 CPU        - no performance improvement for nocball (introduced on commit #007), still 1/2 of 1 cpu performance

### 8th: performance improvement 4 cpus (still not as good as before #007): between #091 (3.11.0-rc2) and HEAD (Linus git)

#110        * 322efba Merge branches 'doc.2014.02.24a', 'fixes.2014.02.26a' and 'rt.2014.02.17b' into HEAD

#108        * ffa83fb rcu: Optimize rcu_needs_cpu() for RCU_NOCB_CPU_ALL

#107        * 2f33b51 rcu: Optimize rcu_is_nocb_cpu() for RCU_NOCB_CPU_ALL

#106        * 88c1863 rcu: Define rcu_assign_pointer() in terms of smp_store_release()

#103        * 87de1cf rcu: Stop tracking FSF's postal address

#099        *   0d3c55b Merge 'doc.2013.12.03a' 'fixes.2013.12.12a' 'rcutorture.2013.12.03a' 'sparse.2013.12.12a'

----        |\  

#097        | * 462225ae rcu: Add an RCU_INITIALIZER for global RCU-protected pointers

#095        | * ac7c8e3 rcu: Add comment on evaluate-once properties of rcu_assign_pointer().

----        |/  

#094        | * 24ef659 rcu: Provide better diagnostics for blocking in RCU callback functions

----        |/  

#093        * 4102ada rcu: Move RCU-related source code to kernel/rcu directory

#092        *   2529973 Merge branch 'idle.2013.09.25a' into HEAD

----        |\  

#090        | * 5c173eb rcu: Consistent rcu_is_watching() naming

#089        | * f9ffc31e rcu: Change EXPORT_SYMBOL() to EXPORT_SYMBOL_GPL()

OBS:

!!!!        1 CPU        - good performance for nocbno, nocbnone, nocball

----        4 CPU        - stabilized performance: nocbno, nocbnone, nocball have the same performance over commits