☕️ [Java Technology Journey] Looking at Thread from the Perspective of the Linux Operating System

Published: June 9, 2021

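Linux processes and threads

Whether in Java or any other language, and however the threading model is defined and implemented, at the lowest level everything comes down to operating-system threads (LWP, the lightweight process, which is mapped onto a kernel thread). The basic concepts are not repeated here.

Richard Stevens's description of threads (original text)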

fork is expensive. Memory is copied from the parent to the child, all descriptors are duplicated in the child, and so on. Current implementations use a technique called copy-on-write, which avoids a copy of the parent’s data space to the child until the child needs its own copy. But, regardless of this optimization, fork is expensive. IPC is required to pass information between the parent and child after the fork. Passing information from the parent to the child before the fork is easy, since the child starts with a copy of the parent’s data space and with a copy of all the parent’s descriptors. But, returning information from the child to the parent takes more work. Threads help with both problems. Threads are sometimes called lightweight processes since a thread is “lighter weight” than a process. That is, thread creation can be 10–100 times faster than process creation. All threads within a process share the same global memory. This makes the sharing of information easy between the threads, but along with this simplicity comes the problem.

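Richard Stevens's description of threads (notes)

On Linux, processes are created with fork and threads with clone.
ps -ef shows the process list; threads can be listed with ps -eLf.
In top, the H toggle switches to the thread view.
As for the Java threading model, the specification does not prescribe how Java threads map to system threads, but the common implementations today map them one-to-one.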
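
The one-to-one mapping can be observed from inside a JVM. The following is a minimal sketch, assuming a Linux system with /proc mounted and a JVM that maps each Java thread to one kernel thread; the class and method names (NativeThreadCount, countNativeThreads, countWithExtraThreads) are made up for this illustration. It counts the entries of /proc/self/task, where the kernel exposes one directory per thread of the current process.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.CountDownLatch;
import java.util.stream.Stream;

final class NativeThreadCount {
    /** Counts entries in /proc/self/task (one per kernel-level thread); -1 if unavailable. */
    static long countNativeThreads() {
        Path taskDir = Paths.get("/proc/self/task");
        if (!Files.isDirectory(taskDir)) {
            return -1; // not Linux, or /proc is not mounted
        }
        try (Stream<Path> entries = Files.list(taskDir)) {
            return entries.count();
        } catch (IOException e) {
            return -1;
        }
    }

    /** Starts n blocked daemon threads and returns the native thread count seen while they are alive. */
    static long countWithExtraThreads(int n) {
        CountDownLatch started = new CountDownLatch(n);
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < n; i++) {
            Thread t = new Thread(() -> {
                started.countDown();
                try {
                    release.await(); // stay alive until the count has been taken
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "demo-" + i);
            t.setDaemon(true);
            t.start();
        }
        try {
            started.await();
            return countNativeThreads();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        } finally {
            release.countDown(); // let the demo threads finish
        }
    }

    public static void main(String[] args) {
        System.out.println("native threads before: " + countNativeThreads());
        System.out.println("native threads with 5 extra Java threads: " + countWithExtraThreads(5));
    }
}
```

The first number already includes the JVM's own service threads (GC, JIT compiler and so on), so only the difference between the two lines is meaningful, and it can be off by a few if the JVM starts or stops a service thread in between.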

Reference

http://openjdk.java.net/groups/hotspot/docs/RuntimeOverview.html#Thread%20Management|outline

Troubleshooting approach

If a Java thread cannot be created, the error is:

Exception in thread "main" java.lang.OutOfMemoryError: unable to create new native thread

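The common causes are listed below.

Too little memory

Creating a thread in Java consumes a certain amount of stack space. The default stack size is 1 MB (it can be adjusted with the -Xss option to suit the application). If the stack is too small or recursion goes too deep, a StackOverflowError may occur.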
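
The link between stack size and recursion depth can be tried out without restarting the JVM, because java.lang.Thread has a constructor that takes a requested stack size. The sketch below is only an illustration: StackDepthDemo and maxDepth are hypothetical names, and the Thread documentation describes the stack size argument as a hint that a platform may round or ignore, so the numbers will differ between machines and JVM versions.

```java
final class StackDepthDemo {
    private static int depth;

    private static void recurse() {
        depth++;
        recurse(); // unbounded on purpose: ends with StackOverflowError
    }

    /** Recurses on a thread created with the requested stack size and returns the depth reached. */
    static int maxDepth(long stackSizeBytes) {
        depth = 0;
        Thread t = new Thread(null, () -> {
            try {
                recurse();
            } catch (StackOverflowError expected) {
                // stack exhausted; depth now holds the number of frames that fit
            }
        }, "stack-demo", stackSizeBytes);
        t.start();
        try {
            t.join(); // join() makes the final value of depth visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("256 KB stack: depth " + maxDepth(256 * 1024));
        System.out.println("  4 MB stack: depth " + maxDepth(4L * 1024 * 1024));
    }
}
```

On a JVM that honours the hint, the second line should report a much larger depth than the first; -Xss changes the same per-thread stack size for threads that do not request one explicitly.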

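For a given process with a fixed amount of usable memory, the more is given to the heap, the less is left for thread stacks.

This limit is mostly seen in 32-bit Java applications: the process address space is 4 GB, of which user space gets 2 GB (3 GB on Linux, so the heap can usually be set larger there).
Subtract the heap size (its range is set with -Xms and -Xmx).
Subtract the non-heap space (the permanent generation part is sized with -XX:PermSize and -XX:MaxPermSize; Java 8 replaced it with Metaspace, which has no size limit by default).
Subtract what the JVM itself consumes.
What remains is stack space. If 300 MB remains, then in theory only 300 threads can be started (with -Xss1M).

For 64-bit applications the process address space is practically unlimited, so this particular limit can be ignored.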
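
The subtraction for the 32-bit case can be written down as a small calculation. This is a sketch with hypothetical numbers chosen so that 300 MB is left over, as in the example above; ThreadBudget and maxThreads are illustrative names, and real JVM overhead is not a fixed figure.

```java
final class ThreadBudget {
    /** All sizes in MB. Returns how many thread stacks fit into what is left of user space. */
    static long maxThreads(long userSpaceMb, long heapMb, long nonHeapMb,
                           long vmOverheadMb, long stackMb) {
        long remainingMb = userSpaceMb - heapMb - nonHeapMb - vmOverheadMb;
        return remainingMb <= 0 ? 0 : remainingMb / stackMb;
    }

    public static void main(String[] args) {
        // Hypothetical 32-bit Linux process: 3072 MB user space, 2048 MB heap,
        // 512 MB non-heap, 212 MB for the JVM itself, which leaves 300 MB for stacks.
        System.out.println(maxThreads(3072, 2048, 512, 212, 1)); // prints 300
    }
}
```

Halving the stack size to 512 KB would double the theoretical count, which is why -Xss is the usual knob when this limit is hit.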

ulimit limits

The number of threads is also subject to system limits, which can be viewed with ulimit -a.

https://ss64.com/bash/ulimit.html

caixj@Lenovo-PC:~$ ulimit -a
core file size          (blocks, -c) 0
data seg size          (kbytes, -d) unlimited
scheduling priority            (-e) 0
file size              (blocks, -f) unlimited
pending signals                (-i) 7823
max locked memory      (kbytes, -l) 64
max memory size        (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues    (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time              (seconds, -t) unlimited
max user processes              (-u) 7823
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

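The relevant limits are:

max memory size: the maximum memory limit; on 64-bit systems it is usually set to unlimited
max user processes: the maximum total number of processes per user (threads included)
virtual memory: the virtual memory limit; on 64-bit systems it is usually set to unlimited

These parameters can be changed with the ulimit command (temporary, for the current shell session) or in the configuration file /etc/security/limits.conf (permanent). To check whether a limit actually applies to a given process, look at its runtime state in /proc/PID/limits.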
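
A running JVM can check its own effective limits the same way. The following is a minimal sketch, assuming Linux and the usual column layout of /proc/PID/limits (limit name, soft limit, hard limit, units); ProcLimits and softLimit are hypothetical names. The row "Max processes" corresponds to max user processes, and "Max address space" to the virtual memory limit.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;

final class ProcLimits {
    /** Returns the soft limit of the row whose name starts the line, e.g. "Max processes". */
    static Optional<String> softLimit(List<String> lines, String name) {
        for (String line : lines) {
            if (line.startsWith(name)) {
                // After the name the columns are: soft limit, hard limit, units.
                String[] cols = line.substring(name.length()).trim().split("\\s+");
                if (cols.length > 0 && !cols[0].isEmpty()) {
                    return Optional.of(cols[0]);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        Path limits = Paths.get("/proc/self/limits");
        if (!Files.isReadable(limits)) {
            System.out.println("/proc/self/limits is not available on this system");
            return;
        }
        List<String> lines = Files.readAllLines(limits);
        System.out.println("Max processes (soft): "
                + softLimit(lines, "Max processes").orElse("?"));
        System.out.println("Max address space (soft): "
                + softLimit(lines, "Max address space").orElse("?"));
    }
}
```

Reading the file from inside the process avoids the common trap of checking ulimit in an interactive shell while the service was started under a different user or init system with different limits.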

The kernel.threads-max parameter

https://www.kernel.org/doc/Documentation/sysctl/kernel.txt
This value controls the maximum number of threads that can be created
using fork().
During initialization the kernel sets this value such that even if the
maximum number of threads is created, the thread structures occupy only
a part (1/8th) of the available RAM pages.
The minimum value that can be written to threads-max is 20.
The maximum value that can be written to threads-max is given by the
constant FUTEX_TID_MASK (0x3fffffff).
If a value outside of this range is written to threads-max an error
EINVAL occurs.
The value written is checked against the available RAM pages. If the
thread structures would occupy too much (more than 1/8th) of the
available RAM pages threads-max is reduced accordingly.

This is the system-wide limit on the total number of threads. It can be set in two ways:

At runtime (temporary):

echo 999999 > /proc/sys/kernel/threads-max

In /etc/sysctl.conf (permanent; keys in this file are relative to /proc/sys, so they carry no sys. prefix):

kernel.threads-max = 999999
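
The relation between a sysctl key and its file under /proc/sys can be made explicit in code. This is a minimal sketch, assuming Linux; SysctlReader, toProcPath and read are hypothetical names, and the simple dot-to-slash mapping is an assumption that holds for the keys discussed in this article but not for keys whose components themselves contain dots.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class SysctlReader {
    /** Maps a sysctl key such as "kernel.threads-max" to /proc/sys/kernel/threads-max. */
    static Path toProcPath(String key) {
        return Paths.get("/proc/sys", key.split("\\."));
    }

    /** Returns the current value as text, or null if the file cannot be read. */
    static String read(String key) {
        try {
            return new String(Files.readAllBytes(toProcPath(key))).trim();
        } catch (IOException e) {
            return null; // not Linux, or the key does not exist on this kernel
        }
    }

    public static void main(String[] args) {
        String[] keys = {"kernel.threads-max", "kernel.pid_max", "vm.max_map_count"};
        for (String key : keys) {
            System.out.println(key + " = " + read(key));
        }
    }
}
```

Printing these values at application start-up is a cheap way to record the limits that were in force when a later "unable to create new native thread" error is investigated.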

The kernel.pid_max parameter

https://www.kernel.org/doc/Documentation/sysctl/kernel.txt
PID allocation wrap value.  When the kernel's next PID value
reaches this value, it wraps back to a minimum PID value.
PIDs of value pid_max or larger are not allocated.

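This is the system-wide limit on PID values. Every thread on Linux takes an ID from the same number space as processes, so this value also caps the total number of threads. It can be set in two ways:

At runtime (temporary):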

echo 999999 > /proc/sys/kernel/pid_max

In /etc/sysctl.conf (permanent):

kernel.pid_max = 999999

The vm.max_map_count parameter

https://www.kernel.org/doc/Documentation/sysctl/vm.txt
This file contains the maximum number of memory map areas a process
may have. Memory map areas are used as a side-effect of calling
malloc, directly by mmap, mprotect, and madvise, and also when loading
shared libraries.
While most applications need less than a thousand maps, certain
programs, particularly malloc debuggers, may consume lots of them,
e.g., up to one or two maps per allocation.
The default value is 65536.

This limits the number of memory map areas a single process may have. It can be set in two ways:

At runtime (temporary):

echo 999999 > /proc/sys/vm/max_map_count

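In /etc/sysctl.conf (permanent):

vm.max_map_count = 999999

With other resources available, the maximum number of threads a single JVM can start is about half of this value. The number of mappings currently in use can be checked with cat /proc/PID/maps | wc -l.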
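
The "half of max_map_count" rule can be turned into a rough headroom estimate. The sketch below assumes Linux and takes the figure of 2 mappings per thread from this article rather than from any JVM guarantee; MapCountHeadroom, additionalThreads and mapsInUse are hypothetical names. It counts the lines of /proc/self/maps, which is the in-process equivalent of the wc -l command above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class MapCountHeadroom {
    /** Rough number of further threads that fit, if each costs mapsPerThread mappings. */
    static long additionalThreads(long maxMapCount, long mapsInUse, long mapsPerThread) {
        long free = maxMapCount - mapsInUse;
        return free <= 0 ? 0 : free / mapsPerThread;
    }

    /** Counts the lines of /proc/self/maps (one per mapping); -1 if unavailable. */
    static long mapsInUse() {
        try {
            long lines = 0;
            for (byte b : Files.readAllBytes(Paths.get("/proc/self/maps"))) {
                if (b == '\n') lines++;
            }
            return lines;
        } catch (IOException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        Path limitFile = Paths.get("/proc/sys/vm/max_map_count");
        long inUse = mapsInUse();
        if (inUse < 0 || !Files.isReadable(limitFile)) {
            System.out.println("/proc is not available on this system");
            return;
        }
        try {
            long max = Long.parseLong(new String(Files.readAllBytes(limitFile)).trim());
            System.out.println("mappings in use: " + inUse + " of " + max);
            // 2 mappings per thread is the assumption made in this article.
            System.out.println("estimated further threads: " + additionalThreads(max, inUse, 2));
        } catch (IOException | NumberFormatException e) {
            System.out.println("could not read vm.max_map_count: " + e);
        }
    }
}
```

The estimate is an upper bound only: heap growth, loaded libraries and direct buffers also consume mappings, and other limits may be reached first.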

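As for why it is only half, here is an analysis based on some references and the source code.

The typical warning messages look like this (see JavaThread::create_stack_guard_pages()):

Attempt to protect stack guard pages failed.
Attempt to deallocate stack guard pages failed.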

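See the diagram in current_stack_region(), together with the related explanation by R大 (RednaxelaFX): http://hllvm.group.iteye.com/group/topic/37717

As shown below, a typical Java thread has a glibc guard page as well as HotSpot guard pages, and JavaThread::create_stack_guard_pages() is the function that creates the HotSpot guard pages. Each thread therefore normally occupies 2 VMAs, which is why the maximum is only half of max_map_count. The same can be seen in /proc/PID/maps: adding one thread adds 2 mappings at adjacent addresses.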

// Java thread:
//
//  Low memory addresses
//    +------------------------+
//    |                        |\  JavaThread created by VM does not have glibc
//    |    glibc guard page    | - guard, attached Java thread usually has
//    |                        |/  1 page glibc guard.
// P1 +------------------------+ Thread::stack_base() - Thread::stack_size()
//    |                        |\
//    |  HotSpot Guard Pages  | - red and yellow pages
//    |                        |/
//    +------------------------+ JavaThread::stack_yellow_zone_base()
//    |                        |\
//    |      Normal Stack      | -
//    |                        |/
// P2 +------------------------+ Thread::stack_base()
//
// Non-Java thread:
//
//  Low memory addresses
//    +------------------------+
//    |                        |\
//    |  glibc guard page      | - usually 1 page
//    |                        |/
// P1 +------------------------+ Thread::stack_base() - Thread::stack_size()
//    |                        |\
//    |      Normal Stack      | -
//    |                        |/
// P2 +------------------------+ Thread::stack_base()
//
// ** P1 and size ( P2 = P1 + size) are the address and stack size returned from
//    pthread_attr_getstack()

Original link: [http://xie.infoq.cn/article/8edad0e348e68ad39ff8058fc]. Please contact the author before reposting.

李浩宇/Alex
