Process-related anti-patterns

This document outlines potential anti-patterns related to processes and process-based abstractions.

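Code organization by process

Problem

This anti-pattern refers to code that is unnecessarily organized by processes. A process is not an anti-pattern in itself, but it should only be used to model runtime properties (such as concurrency, access to shared resources, or error isolation). Using a process purely to organize code can create bottlenecks in the system.

Example

The module below illustrates this anti-pattern: it implements arithmetic operations (add and subtract) through a GenServer process. Because a GenServer handles one message at a time, every caller is serialized behind that single process. As the number of calls grows, this organization can degrade system performance and become a bottleneck.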

defmodule Calculator do
  @moduledoc """
  Calculator that performs basic arithmetic operations.

  This code is unnecessarily organized in a GenServer process.
  """

  use GenServer

  def add(a, b, pid) do
    GenServer.call(pid, {:add, a, b})
  end

  def subtract(a, b, pid) do
    GenServer.call(pid, {:subtract, a, b})
  end

  @impl GenServer
  def init(init_arg) do
    {:ok, init_arg}
  end

  @impl GenServer
  def handle_call({:add, a, b}, _from, state) do
    {:reply, a + b, state}
  end

  def handle_call({:subtract, a, b}, _from, state) do
    {:reply, a - b, state}
  end
end

iex> {:ok, pid} = GenServer.start_link(Calculator, :init)
{:ok, #PID<0.132.0>}
iex> Calculator.add(1, 5, pid)
6
iex> Calculator.subtract(2, 3, pid)
-1

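Refactoring

In Elixir, as shown next, code organization must be done only through modules and functions. Whenever possible, a library should not impose specific behavior (such as parallelization) on its users. It is better to leave that decision to the developers of client code, which increases the potential for reuse of the library.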

defmodule Calculator do
  def add(a, b) do
    a + b
  end

  def subtract(a, b) do
    a - b
  end
end

iex> Calculator.add(1, 5)
6
iex> Calculator.subtract(2, 3)
-1
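
To illustrate leaving the concurrency decision to the client, here is a minimal sketch. CalculatorClient and sum_pairs/1 are hypothetical names introduced for this illustration; Calculator is repeated in trimmed form so the snippet is self-contained. The library stays a set of plain functions, and the caller chooses to run them concurrently with Task.async_stream/3.

```elixir
defmodule Calculator do
  # Plain function: no process involved.
  def add(a, b), do: a + b
end

defmodule CalculatorClient do
  # Hypothetical client module: the caller, not the library,
  # opts into concurrency.
  def sum_pairs(pairs) do
    pairs
    |> Task.async_stream(fn {a, b} -> Calculator.add(a, b) end)
    # Results arrive in input order as {:ok, result} tuples.
    |> Enum.map(fn {:ok, sum} -> sum end)
  end
end

CalculatorClient.sum_pairs([{1, 5}, {2, 3}, {10, 20}])
```

A client that does not need concurrency can simply call Calculator.add/2 directly, with no process overhead.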

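Scattered process interfaces

Problem

In Elixir, using an Agent, a GenServer, or any other process abstraction is not an anti-pattern in itself. The problem arises when the responsibility for interacting directly with a process is spread throughout the entire system. This practice makes the code harder to maintain and more prone to bugs.

Example

The following code illustrates this anti-pattern. The responsibility for interacting directly with the Agent is spread across four different modules (A, B, C, and D).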

defmodule A do
  def update(process) do
    # Some other code...
    Agent.update(process, fn _list -> 123 end)
  end
end

defmodule B do
  def update(process) do
    # Some other code...
    Agent.update(process, fn content -> %{a: content} end)
  end
end

defmodule C do
  def update(process) do
    # Some other code...
    Agent.update(process, fn content -> [:atom_value | content] end)
  end
end

defmodule D do
  def get(process) do
    # Some other code...
    Agent.get(process, fn content -> content end)
  end
end

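This spreading of responsibility leads to duplicated code and makes maintenance harder. In addition, because nothing controls the format of the shared data, arbitrarily mixed data may end up being shared. The freedom to use any data format is dangerous and can lead developers to introduce bugs.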

# start an agent with initial state of an empty list
iex> {:ok, agent} = Agent.start_link(fn -> [] end)
{:ok, #PID<0.135.0>}

# many data formats (for example, List, Map, Integer, Atom) are
# combined through direct access spread across the entire system
iex> A.update(agent)
iex> B.update(agent)
iex> C.update(agent)

# state of shared information
iex> D.get(agent)
[:atom_value | %{a: 123}]

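For a GenServer and other behaviours, this anti-pattern appears when calls to GenServer.call/3 and GenServer.cast/2 are scattered across multiple modules instead of encapsulating all interaction with the GenServer in a single place.

Refactoring

Instead of spreading direct access to a process abstraction, such as an Agent, over many places in the code, refactor by centralizing the responsibility for interacting with the process in a single module. This refactoring improves maintainability by removing duplicated code. It also lets you restrict the accepted format of the shared data, which reduces the risk of bugs. As shown below, the module Foo.Bucket centralizes the responsibility for interacting with the Agent. Any other place in the code that needs the shared data must now delegate that access to Foo.Bucket. Also, Foo.Bucket now only allows data to be shared in Map format.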

defmodule Foo.Bucket do
  use Agent

  def start_link(_opts) do
    Agent.start_link(fn -> %{} end)
  end

  def get(bucket, key) do
    Agent.get(bucket, &Map.get(&1, key))
  end

  def put(bucket, key, value) do
    Agent.update(bucket, &Map.put(&1, key, value))
  end
end

The following is an example of delegating access to the shared data (provided by an Agent) to Foo.Bucket.

# start an agent through `Foo.Bucket`
iex> {:ok, bucket} = Foo.Bucket.start_link(%{})
{:ok, #PID<0.114.0>}

# add shared values to the keys `milk` and `beer`
iex> Foo.Bucket.put(bucket, "milk", 3)
iex> Foo.Bucket.put(bucket, "beer", 7)

# access shared data of specific keys
iex> Foo.Bucket.get(bucket, "beer")
7
iex> Foo.Bucket.get(bucket, "milk")
3
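
The same centralization applies to a GenServer. The sketch below is a hypothetical GenServer counterpart of Foo.Bucket (the module name Foo.BucketServer and its functions are assumptions made for this illustration, not part of the original text). All calls to GenServer.call/3 and GenServer.cast/2 live in the client functions of one module, so the message shapes are never exposed to the rest of the system.

```elixir
defmodule Foo.BucketServer do
  # Hypothetical module, introduced only for this sketch.
  use GenServer

  # Client API: the only place that talks to the process directly.

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, %{}, opts)
  end

  def get(server, key), do: GenServer.call(server, {:get, key})

  def put(server, key, value), do: GenServer.cast(server, {:put, key, value})

  # Server callbacks: the state is always a map.

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call({:get, key}, _from, state) do
    {:reply, Map.get(state, key), state}
  end

  @impl true
  def handle_cast({:put, key, value}, state) do
    {:noreply, Map.put(state, key, value)}
  end
end

{:ok, server} = Foo.BucketServer.start_link()
Foo.BucketServer.put(server, "milk", 3)
Foo.BucketServer.get(server, "milk")
```

If the message format ever changes, only this module needs to be updated.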

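Additional remarks

This anti-pattern was formerly known as Agent obsession.

Sending unnecessary data

Problem

Sending a message to a process can be an expensive operation if the message is large enough. The message is fully copied to the receiving process, which can be CPU and memory intensive. This follows from Erlang's "share nothing" architecture, in which each process has its own memory, a design that simplifies and speeds up garbage collection.

The cost is most obvious when using send/2, GenServer.call/3, or the initial data in GenServer.start_link/3. Notably, it also occurs with spawn/1, Task.async/1, Task.async_stream/3, and similar functions. The situation is more subtle there, because the anonymous function passed to these functions captures the variables it references, and every captured variable is copied over. As a result, you can accidentally send far more data to a process than it actually needs.

Example

Imagine you want to implement some simple reporting of the IP addresses that make requests against your application. You want to do this asynchronously, without blocking request processing, so you decide to use spawn/1. Handing over the whole connection may seem like a good idea, because more data might be needed later. However, passing the connection copies a lot of unnecessary data, such as the request body, params, and so on.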

# log_request_ip sends the IP to some external service
spawn(fn -> log_request_ip(conn) end)

The problem also occurs when only the relevant part is accessed:

spawn(fn -> log_request_ip(conn.remote_ip) end)

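This still copies the whole conn, because the conn variable is captured by the spawned function. The function extracts the remote_ip field only after the entire conn has been copied over.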
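
To make the capture visible, here is a minimal sketch that inspects a closure's environment with :erlang.fun_info/2. CaptureDemo and its functions are hypothetical names, and conn is a plain map standing in for a real connection struct; the exact contents of the environment are an implementation detail of the runtime, so treat this as an illustration rather than a guarantee.

```elixir
defmodule CaptureDemo do
  # Hypothetical helper module, introduced only for this sketch.

  # The closure references `conn`, so the whole map is captured.
  def whole_conn(conn), do: fn -> conn.remote_ip end

  # The field is extracted first, so only the IP tuple is captured.
  def only_ip(conn) do
    ip = conn.remote_ip
    fn -> ip end
  end

  # Returns the list of terms captured in the closure's environment.
  def captured(fun) do
    {:env, env} = :erlang.fun_info(fun, :env)
    env
  end
end

# A stand-in for a connection: a small field next to a larger one.
conn = %{remote_ip: {127, 0, 0, 1}, params: Enum.to_list(1..1_000)}

# The environment contains the entire conn map.
CaptureDemo.captured(CaptureDemo.whole_conn(conn))

# The environment contains only the IP tuple.
CaptureDemo.captured(CaptureDemo.only_ip(conn))
```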

send/2 and the GenServer APIs also rely on message passing. In the example below, the conn is once again copied to the underlying GenServer:

GenServer.cast(pid, {:report_ip_address, conn})

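Refactoring

This anti-pattern has several potential remedies:

Limit the data you send to the absolute minimum instead of sending an entire struct. For example, don't send an entire conn struct if all you need is a couple of fields.
If the only process that needs the data is the one you are sending it to, consider having that process fetch the data itself rather than passing it.
Some abstractions, such as :persistent_term, allow you to share data between processes, as long as that data changes infrequently.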
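
As a minimal sketch of the last remedy, the snippet below stores rarely changing settings with :persistent_term and reads them from another process. ReportConfig, its key, and the settings map are hypothetical names chosen for this illustration. Processes read persistent terms without copying them onto their own heap, but updating or erasing a term is expensive, which is why this only suits data that changes infrequently.

```elixir
defmodule ReportConfig do
  # Hypothetical module, introduced only for this sketch.
  @key {__MODULE__, :settings}

  # Meant to be called rarely, for example once at application start.
  def store(settings), do: :persistent_term.put(@key, settings)

  # Cheap to call from any process.
  def fetch, do: :persistent_term.get(@key)
end

ReportConfig.store(%{endpoint: "https://reports.example.com", retries: 3})

# The closure captures nothing; the task looks the settings up itself.
task = Task.async(fn -> ReportConfig.fetch().retries end)
Task.await(task)
```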

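In our case, limiting the input data is a reasonable strategy. If all we need right now is the IP address, then let's work only with that and make sure only the IP address is passed into the closure, like so: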

ip_address = conn.remote_ip
spawn(fn -> log_request_ip(ip_address) end)

Or, in the GenServer case:

GenServer.cast(pid, {:report_ip_address, conn.remote_ip})

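Unsupervised processes

Problem

In Elixir, creating a process outside a supervision tree is not an anti-pattern in itself. However, when you spawn many long-running processes outside of supervision trees, those processes become hard to observe and monitor, which prevents developers from fully controlling their applications.

Example

The following code illustrates a library responsible for maintaining a numerical Counter through a GenServer process outside a supervision tree. Because a client can create multiple counters at the same time (one process per counter), these unsupervised processes become difficult to manage. This can cause problems with the initialization, restart, and shutdown of a system.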

defmodule Counter do
  @moduledoc """
  Global counter implemented through a GenServer process.
  """

  use GenServer

  @doc "Starts a counter process."
  def start_link(opts \\ []) do
    initial_value = Keyword.get(opts, :initial_value, 0)
    name = Keyword.get(opts, :name, __MODULE__)
    GenServer.start_link(__MODULE__, initial_value, name: name)
  end

  @doc "Gets the current value of the given counter."
  def get(pid_name \\ __MODULE__) do
    GenServer.call(pid_name, :get)
  end

  @doc "Bumps the value of the given counter."
  def bump(pid_name \\ __MODULE__, value) do
    GenServer.call(pid_name, {:bump, value})
  end

  @impl true
  def init(counter) do
    {:ok, counter}
  end

  @impl true
  def handle_call(:get, _from, counter) do
    {:reply, counter, counter}
  end

  def handle_call({:bump, value}, _from, counter) do
    # Reply with the updated value so callers see the result of the bump
    new_counter = counter + value
    {:reply, new_counter, new_counter}
  end
end

iex> Counter.start_link()
{:ok, #PID<0.115.0>}
iex> Counter.get()
0
iex> Counter.start_link(initial_value: 15, name: :other_counter)
{:ok, #PID<0.120.0>}
iex> Counter.get(:other_counter)
15
iex> Counter.bump(:other_counter, -3)
12
iex> Counter.bump(Counter, 7)
7

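Refactoring

To ensure that clients of a library have full control over their systems, regardless of the number of processes used and the lifetime of each one, all processes must be started inside a supervision tree. As shown below, this code uses a Supervisor as a supervision tree. When this Elixir application starts, two different counters (Counter and :other_counter) are also started as child processes of the Supervisor named App.Supervisor. One is initialized with 0, the other with 15. Through this supervision tree, the life cycle of every child process can be managed (each one can be stopped or restarted), which improves the visibility of the entire application.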

defmodule SupervisedProcess.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # With the default values for counter and name
      Counter,
      # With custom values for counter, name, and a custom ID
      Supervisor.child_spec(
        {Counter, name: :other_counter, initial_value: 15},
        id: :other_counter
      )
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: App.Supervisor)
  end
end

iex> Supervisor.count_children(App.Supervisor)
%{active: 2, specs: 2, supervisors: 0, workers: 2}
iex> Counter.get(Counter)
0
iex> Counter.get(:other_counter)
15
iex> Counter.bump(Counter, 7)
7
iex> Supervisor.terminate_child(App.Supervisor, Counter)
iex> Supervisor.count_children(App.Supervisor) # Only one active child
%{active: 1, specs: 2, supervisors: 0, workers: 2}
iex> Counter.get(Counter) # The process was terminated
** (EXIT) no process: the process is not alive...
iex> Supervisor.restart_child(App.Supervisor, Counter)
iex> Counter.get(Counter) # After the restart, this process can be used again
0
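
When clients need to create counters on demand at runtime rather than list them up front, a DynamicSupervisor can keep those processes supervised as well. The following is a minimal sketch under stated assumptions: SketchCounter and SketchCounter.Supervisor are hypothetical names, and the counter is a trimmed-down version written only for this illustration.

```elixir
defmodule SketchCounter do
  # Hypothetical, trimmed-down counter used only for this sketch.
  use GenServer

  def start_link(opts) do
    {initial_value, server_opts} = Keyword.pop(opts, :initial_value, 0)
    GenServer.start_link(__MODULE__, initial_value, server_opts)
  end

  def get(server), do: GenServer.call(server, :get)

  @impl true
  def init(counter), do: {:ok, counter}

  @impl true
  def handle_call(:get, _from, counter), do: {:reply, counter, counter}
end

# In a real application, this supervisor would itself be a child
# in the application's supervision tree.
{:ok, _sup} =
  DynamicSupervisor.start_link(strategy: :one_for_one, name: SketchCounter.Supervisor)

# Clients start counters on demand, always under the supervisor.
{:ok, counter} =
  DynamicSupervisor.start_child(SketchCounter.Supervisor, {SketchCounter, initial_value: 15})

SketchCounter.get(counter)
```

Each counter started this way shows up in DynamicSupervisor.count_children/1 and is shut down together with its supervisor.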
