Node child_process study notes

date

Feb 16, 2023

slug

node-child-process

status

Published

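Questions answered by reading the source

What exactly is the difference between exec and execFile?

Why are exec / execFile / fork all implemented on top of spawn, and what does spawn actually do?

Why does spawn take no callback, while exec / execFile do?

Why do you have to call child.stdout.on('data', callback) yourself after spawn, and what exactly are child.stdout / child.stderr?

Why are there so many callbacks (data / error / exit / close), and in what order do they fire?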

In-depth analysis of the exec source

child_process

exec
execFile
spawn

internal/child_process

ChildProcess
spawn

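Set a breakpoint at exec and step over

Step into execFile

Step into spawn

ChildProcess comes from const child_process = require('internal/child_process');

Inside ChildProcess, Process is used to create the process, and an onexit listener is attached to the instance's _handle (it fires when the child process finishes running)

Process comes from const { Process } = internalBinding('process_wrap'); (src/process_wrap.cc in the node source)

Return level by level to the bottom of the execFile function

At this point execFile listens for the close / error events of the newly created child (ChildProcess inherits them from EventEmitter)

When the onexit attached above is triggered from C++, it calls this._handle.close() to close the process handle (handled in C++). At the end of onexit, maybeClose(subprocess) is called; once the stdio streams are closed as well, maybeClose runs subprocess.emit('close', subprocess.exitCode, subprocess.signalCode), which notifies the close listener registered at the bottom of execFile

Enter exithandler, the close handler registered in execFile, which packages error / stdout / stderr and passes them to the callback that was supplied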
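The walk-through above explains why spawn has no callback while execFile does: the callback is just a 'close' listener plus buffering. The following is a minimal sketch of that idea, not Node's actual implementation; miniExecFile is a hypothetical name, and the error object it builds is simplified.

```javascript
const { spawn } = require('node:child_process');

// Hypothetical helper (not Node's implementation): a callback API layered on spawn.
function miniExecFile(file, args, callback) {
  const child = spawn(file, args);
  const out = [];
  const err = [];
  let spawnError = null;

  // Buffer every chunk instead of handing it to the caller immediately.
  child.stdout.on('data', (chunk) => out.push(chunk));
  child.stderr.on('data', (chunk) => err.push(chunk));

  // 'error' fires when the process could not be spawned.
  child.on('error', (e) => { spawnError = e; });

  // 'close' fires after the process has exited and its stdio streams are closed,
  // so all output is available by now.
  child.on('close', (code, signal) => {
    const stdout = Buffer.concat(out).toString('utf8');
    const stderr = Buffer.concat(err).toString('utf8');
    let error = spawnError;
    if (!error && (code !== 0 || signal)) {
      error = new Error(`Command failed: ${file} ${args.join(' ')}`);
      error.code = code;
      error.signal = signal;
    }
    callback(error, stdout, stderr);
  });

  return child;
}

// Use the running node binary as the command so the example does not depend on the OS.
miniExecFile(process.execPath, ['-p', '6 * 7'], (error, stdout, stderr) => {
  console.log({ error, stdout, stderr });
});
```

The returned child is still an ordinary ChildProcess, so a caller can attach its own stdout 'data' listeners alongside the callback, as the notes do later with exec.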

Debugging node locally

The pid 7892 can also be seen on the child object, as _handle.pid or pid.

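Using `shell`

Method 1: run a shell file directly

/bin/sh test.shell

Method 2: run a shell statement directly (similar to node -e "console.log(123)")

/bin/sh -c "ls -la"

Without the -c flag you have to specify a file (pass a file path)
The shell command ls -la is equivalent to /bin/sh -c "ls -la" (the system takes care of this step, so the short form is normally enough)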
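To make the -c form concrete, here is a small sketch that runs the same command string two ways: once through execSync, which hands the string to the default shell, and once by starting /bin/sh -c explicitly. It assumes a POSIX system where /bin/sh exists; the command string itself is only an illustration.

```javascript
const { execSync, execFileSync } = require('node:child_process');

// Assumes a POSIX system where /bin/sh exists.
// The && operator is shell syntax, so this string only works when a shell interprets it.
const command = 'echo hello && echo world';

// exec-style: Node hands the whole string to the default shell.
const viaExec = execSync(command, { encoding: 'utf8' });

// execFile-style: start the shell ourselves and pass the string after -c.
const viaShell = execFileSync('/bin/sh', ['-c', command], { encoding: 'utf8' });

console.log(JSON.stringify(viaExec));
console.log(JSON.stringify(viaShell));
```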

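Close reading of the `exec` source

Shallow copy with the object spread operator

// Equivalent to {...Object(true)}
{...true} // {}

// Equivalent to {...Object(undefined)}
{...undefined} // {}

// Equivalent to {...Object(null)}
{...null} // {}

{...'hello'}
// {0: "h", 1: "e", 2: "l", 3: "l", 4: "o"}

{ ...['a', 'b', 'c'] };
// {0: "a", 1: "b", 2: "c"}

Shallow copy vs. deep copy

Shallow copy: creates a new object that holds an exact copy of the original object's property values. If a property is a primitive, the value itself is copied; if it is a reference type, the reference (memory address) is copied, so when either object mutates the data behind that reference, the other object sees the change.
Deep copy: copies the whole object out of memory, including everything its properties refer to. A new region of heap memory is allocated for the new object, and modifying the new object does not affect the original.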
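A short sketch of the difference, using made-up data. The deep copy here goes through JSON, which is an assumption that only holds for JSON-compatible values (no functions, Dates, or cycles).

```javascript
const original = { name: 'opts', env: { PATH: '/usr/bin' } };

// Shallow copy: top-level properties are copied, nested objects are shared.
const shallow = { ...original };

// Deep copy via a JSON round trip (only valid for JSON-compatible data).
const deep = JSON.parse(JSON.stringify(original));

shallow.name = 'changed';   // does not touch original.name
shallow.env.PATH = '/tmp';  // mutates the object shared with original
deep.env.PATH = '/opt';     // mutates an independent copy

console.log(original.name);     // opts
console.log(original.env.PATH); // /tmp
console.log(deep.env.PATH);     // /opt
```

This is why options = { ...options } in normalizeExecArgs is enough to add a shell property without touching the caller's object, while nested values such as env would still be shared.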

Note: whatever non-function value is passed as the second argument, options always ends up as an object.

function normalizeExecArgs(command, options, callback) {
  if (typeof options === 'function') {
    callback = options;
    options = undefined;
  }

  // Shallow copy
  options = { ...options }; // any non-function value is turned into an options object
  options.shell = typeof options.shell === 'string' ? options.shell : true; // make sure the shell property is set

  return {
    file: command,
    options: options, // options always has at least one property: shell
    callback: callback
  };
}

options.shell can be a string naming the shell used to run the command. Default: '/bin/sh' on Unix, process.env.ComSpec on Windows

execFile starts by checking its arguments one by one; the logic is interesting

function execFile(file /* , args, options, callback */) {
  let args = [];
  let callback;
  let options;

  // Parse the optional arguments positionally via `arguments` (the first argument is the path of the file to execute)
  let pos = 1;
  if (pos < arguments.length && Array.isArray(arguments[pos])) { // an array: the arguments passed to the file
    args = arguments[pos++];
  } else if (pos < arguments.length && arguments[pos] == null) { // null/undefined in the second position: skip args
    pos++;
  }

  if (pos < arguments.length && typeof arguments[pos] === 'object') { // an object: treat it as options
    options = arguments[pos++];
  } else if (pos < arguments.length && arguments[pos] == null) { // null/undefined: skip
    pos++;
  }

  if (pos < arguments.length && typeof arguments[pos] === 'function') { // the callback
    callback = arguments[pos++];
  }

  if (!callback && pos < arguments.length && arguments[pos] != null) { // an argument is left over but no callback was parsed: throw
    throw new ERR_INVALID_ARG_VALUE('args', arguments[pos]);
  }
  ...
}

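Parsing arguments this way lets callers omit the optional arguments, or pass null for them, without shifting the position of the ones that follow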
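As an illustration of that parsing, the sketch below calls execFile in several forms that should all reach the same callback. It uses the running node binary (process.execPath) as the file so it does not depend on the OS; callForm is a hypothetical helper, and the timeout value is arbitrary.

```javascript
const { execFile } = require('node:child_process');

// Hypothetical helper: wraps one execFile call form in a promise that resolves to trimmed stdout.
function callForm(invoke) {
  return new Promise((resolve, reject) => {
    invoke((error, stdout) => (error ? reject(error) : resolve(String(stdout).trim())));
  });
}

const node = process.execPath;
const args = ['-p', '1 + 1'];

const forms = Promise.all([
  callForm((cb) => execFile(node, args, cb)),                     // file, args, callback
  callForm((cb) => execFile(node, args, { timeout: 10000 }, cb)), // file, args, options, callback
  callForm((cb) => execFile(node, args, null, cb)),               // options passed as null
  callForm((cb) => execFile(node, args, undefined, cb)),          // options passed as undefined
]);

forms.then((results) => console.log(results));
```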

Shallow copy of an array

// args = args.slice(0)
var a = [1, 2, 3];
var b = a.slice(0); // b: [1, 2, 3], same as b = [...a]
a === b; // false

The command-concatenation part of spawn

if (options.shell) {
  const command = [file].concat(args).join(' '); // join the command file and its arguments with spaces
  // Set the shell, switches, and commands.
  if (process.platform === 'win32') { // Windows
    if (typeof options.shell === 'string') // a custom shell was specified
      file = options.shell;
    else
      file = process.env.comspec || 'cmd.exe';
    // '/d /s /c' is used only for cmd.exe.
    if (/^(?:.*\\)?cmd(?:\.exe)?$/i.test(file)) { // matches cmd or cmd.exe, with or without a leading path
      args = ['/d', '/s', '/c', `"${command}"`]; // these switches apply only to cmd.exe
      options.windowsVerbatimArguments = true; // pass the arguments through without quoting or escaping
    } else {
      args = ['-c', command];
    }
  } else {
    if (typeof options.shell === 'string')
      file = options.shell;
    else if (process.platform === 'android') // Android
      file = '/system/bin/sh';
    else
      file = '/bin/sh'; // default to '/bin/sh'
    args = ['-c', command];
  }
}
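One visible consequence of joining file and args with plain spaces is that argument boundaries are lost once a shell is involved. The sketch below shows this, assuming a POSIX system with /bin/sh and an echo executable on PATH; the argument value is made up.

```javascript
const { spawnSync } = require('node:child_process');

// Assumes a POSIX system with /bin/sh and an echo executable on PATH.
const arg = 'a   b'; // a single argument that contains three spaces

// Without a shell, the argument reaches echo untouched.
const direct = spawnSync('echo', [arg], { encoding: 'utf8' });

// With shell: true, the command becomes /bin/sh -c "echo a   b",
// and the shell splits the joined string on whitespace again.
const viaShell = spawnSync('echo', [arg], { shell: true, encoding: 'utf8' });

console.log(JSON.stringify(direct.stdout));   // "a   b\n"
console.log(JSON.stringify(viaShell.stdout)); // "a b\n"
```

The same mechanism is why untrusted input should not be passed as args when shell is enabled: nothing in the join step quotes or escapes it.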

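new ChildProcess() in spawn

After EventEmitter.call(this); the instance can dispatch events.

emit dispatches
on listens

this._handle.onexit is the callback invoked once the process has finished
child.spawn / ChildProcess.prototype.spawn

getValidStdio() creates the input / output / error streams (each entry carries its handle on the handle property)

Input stream: the child process can only read from it
Output stream: the child process can only write to it
new Pipe() creates the pipe used for socket communication, via pipe_wrap
ipc / ipcFd set up two-way communication between processes; created when using fork

for (i = 0; i < stdio.length; i++) loops over the stdio entries to set up socket communication between parent and child

The socket objects are listened to with on('data')
The three socket objects are bound to the instance's stdin / stdout / stderr, which is why the object returned by spawn exposes child.stdout.on('data', ...)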
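The points above can be checked from user code. This sketch spawns the running node binary (an assumption made only to stay OS-independent) and inspects child.stdout with the default piped stdio and with stdio: 'inherit'.

```javascript
const net = require('node:net');
const { spawn } = require('node:child_process');

// Default stdio ('pipe'): the parent gets one socket per standard stream.
const piped = spawn(process.execPath, ['-e', 'console.log("hi")']);
console.log(piped.stdout instanceof net.Socket); // true

// Data arrives in chunks (Buffers by default), so collect them until 'close'.
const chunks = [];
piped.stdout.on('data', (chunk) => chunks.push(chunk));
piped.on('close', () => console.log(Buffer.concat(chunks).toString()));

// stdio: 'inherit': the child uses the parent's descriptors directly,
// so there is no socket for the parent to listen on.
const inherited = spawn(process.execPath, ['-e', '0'], { stdio: 'inherit' });
console.log(inherited.stdout); // null
```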

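`child_process` callback flow

Process runs the command

child._handle.spawn(options) runs the command
An exitCode of 0 means success; a value below 0 means failure

Once the command has been started successfully, it writes to the "streams", and onStreamRead is called back to read the data from the stream

When onStreamRead has read a stream to its end, onReadableStreamEnd is called

In maybeClose(), once all sockets are confirmed closed, the close event is emitted for the child process

Two lines of execution:

The child process's execution
The reading of the streams

onStreamRead and onReadableStreamEnd both belong to node's net.Socket machinery. Changes in the socket pipe data are observed through this API, which emits events internally; underneath, both are driven by pipe_wrap.cc, which notifies onStreamRead and onReadableStreamEnd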

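Execution order of the event handlers

const { exec } = require('child_process')

const child = exec('ls -al | grep node_modules', (err, stdout, stderr) => {
  // execFile buffers the data and hands it to the callback in one go, as complete stdout / stderr strings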
  console.log(
    '//:============================== exec callback ==============================://'
  )
  console.log(err)
  console.log(stdout)
  console.error(stderr)
  console.log(
    '//:============================== exec callback ==============================://'
  )
})

// The code below listens to the sockets directly; data arrives chunk by chunk
child.stdout?.on('data', (chunk) => {
  console.log('stdout data', chunk)
})

child.stdout?.on('close', () => {
  console.log('stdout close')
})

child.stderr?.on('data', (chunk) => {
  console.log('stderr data', chunk)
})

child.stderr?.on('close', () => {
  console.log('stderr close')
})

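// The 'exit' event fires after the child process ends.
// When 'exit' fires, the child's stdio streams may still be open.
// maybeClose is then called; once the streams are closed it emits 'close',
// which runs the exec callback above first,
// then the child.on('close') handler below (console.log('close!', code)),
// and stdout's own close event arrives after that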
child.on('exit', (code) => {
  console.log('exit!', code)
})

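// The 'close' event fires after the process has ended and its stdio streams have been closed.
// It differs from 'exit' because multiple processes may share the same stdio streams.
// 'close' always fires after 'exit', or after 'error' if the child failed to spawn.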
child.on('close', (code) => {
  console.log('close!', code)
})

stdout data drwxr-xr-x  39 i7eo  staff    1248  1 31 18:48 node_modules

exit! 0
stderr close
//:============================== exec callback ==============================://
null
drwxr-xr-x  39 i7eo  staff    1248  1 31 18:48 node_modules


//:============================== exec callback ==============================://
close! 0
stdout close
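The exact interleaving of the stream events in a log like the one above can vary between runs and platforms, but the relative order of 'exit' and 'close' should not. The sketch below records the order with spawn; recordEvents is a hypothetical helper, and the child is the running node binary rather than ls so the example does not depend on the OS.

```javascript
const { spawn } = require('node:child_process');

// Hypothetical helper: resolves with the event names in the order they fired.
function recordEvents() {
  return new Promise((resolve) => {
    const order = [];
    const child = spawn(process.execPath, ['-e', 'console.log("out"); console.error("err")']);

    child.stdout.on('data', () => order.push('stdout data'));
    child.stderr.on('data', () => order.push('stderr data'));
    child.stdout.on('close', () => order.push('stdout close'));
    child.stderr.on('close', () => order.push('stderr close'));
    child.on('exit', () => order.push('exit'));
    child.on('close', () => {
      order.push('close');
      // Wait one more turn so a stream 'close' delivered right after is still recorded.
      setImmediate(() => resolve(order));
    });
  });
}

recordEvents().then((order) => console.log(order));
```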

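Summary of the Node multi-process source

Differences between exec / execFile / spawn / fork

exec: runs the command string we pass in via /bin/sh -c; it calls execFile after normalizing the arguments
execFile: executes the file and args we pass in directly; underneath it calls spawn to create and run the child process, but by listening to the events emitted by the spawned child it provides a callback and returns all of stdout and stderr in one go
spawn: calls into internal/child_process, instantiates a ChildProcess object, then calls ChildProcess.prototype.spawn() to create the child process and run the command; underneath that is child._handle.spawn(), which runs the spawn method of the C++ process_wrap. Execution is asynchronous. Output travels one way through pipes; when the child exits, child._handle.onexit is called back, and the sockets run their close callbacks as they finish.
fork: creates the child process and runs the command through spawn. It runs the command with node and uses setupChannel to create an IPC channel for two-way communication between child and parent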
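To illustrate the claim that fork is spawn plus an IPC channel, the sketch below gets fork-like messaging from plain spawn by adding an 'ipc' entry to stdio. doubleViaIpc and the inline child script are hypothetical names made up for this example; real code would normally just call fork with a module path.

```javascript
const { spawn } = require('node:child_process');

// Child script passed via -e; process.send exists only because an IPC channel was requested.
const childSource = `
  process.on('message', (n) => {
    // Reply, then close the channel so the child can exit.
    process.send(n * 2, () => process.disconnect());
  });
`;

// Hypothetical helper: sends n to the child and resolves with the reply.
function doubleViaIpc(n) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', childSource], {
      // The fourth entry asks for the IPC channel that fork would set up for us.
      stdio: ['ignore', 'inherit', 'inherit', 'ipc'],
    });
    child.once('message', resolve);
    child.once('error', reject);
    child.send(n);
  });
}

doubleViaIpc(21).then((reply) => console.log(reply));
```

Unlike the stdout / stderr pipes, which carry data in one direction only, the IPC channel lets both sides call send and listen for 'message'.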

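Differences between the data / error / exit / close callbacks

data: fired via onStreamRead while the parent process is reading data
error: fired when the command fails to run (for example, the child could not be spawned)
exit: fired once the child process has exited
close: fired once the child has exited and all of its sockets have been closed
stdout close / stderr close: fired when a particular pipe has been read to the end and onReadableStreamEnd() closes its socket.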
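The error case is the easiest one to miss, because nothing is thrown at the call site. A minimal sketch, assuming the made-up command name below really does not exist on the machine; spawnMissing is a hypothetical helper.

```javascript
const { spawn } = require('node:child_process');

// Hypothetical helper: resolves with the events seen for a command that cannot be spawned.
function spawnMissing() {
  return new Promise((resolve) => {
    const seen = [];
    // The command name is made up on purpose so that spawning fails.
    const child = spawn('this-command-should-not-exist-12345');
    child.on('error', (err) => seen.push(`error:${err.code}`));
    child.on('exit', () => seen.push('exit'));
    child.on('close', () => resolve([...seen, 'close']));
  });
}

spawnMissing().then((seen) => console.log(seen));
```

Without an 'error' listener the same failure would surface as an uncaught exception, so code that uses spawn directly usually registers one.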