Refresh Flow Data Using Incremental Refresh

  • Version: 2022.1 and later

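Note: Starting in version 2020.4.1, you can create and edit flows in Tableau Server and Tableau Online. The content in this topic applies to all platforms, unless specifically noted. For more information about authoring flows on the web, see Tableau Prep on the Web in the Tableau Server help.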

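Starting in Tableau Prep Builder version 2020.2.1 and on the web, you can configure flow inputs and outputs to use incremental refresh, so that only new rows are retrieved and processed when the flow runs. This saves time and resources.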

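For example, if your flow includes transaction data that is updated daily, you can set up incremental refresh to retrieve and process only the new transactions each day, then run a full refresh weekly or monthly to refresh all of your flow data.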

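Note: To run incremental refresh on flow inputs that use the Salesforce connector, you must use Tableau Prep Builder version 2021.1.2 or later. Incremental refresh isn't currently supported when writing flow output to Microsoft Excel.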

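To run a flow using incremental refresh, Tableau Prep needs the following information:

  • A field that detects the new rows in the input table.

  • A field that is used to compare the last processed value in the flow output with the values in the input, to determine which rows are new.

  • How you want to write the new data to your table. You can add the new data to your existing table, overwrite the table data with the new data, or, starting in Tableau Prep Builder version 2020.3.1 and on the web, replace the data in your existing table.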
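
The comparison described above can be pictured with a short Python sketch. It is only an illustration, not Tableau Prep's actual implementation: the function name `find_new_rows`, the sample fields, and the rule "a row is new if its value is greater than the largest value already in the output" are all assumptions made for this example.

```python
from datetime import date


def find_new_rows(input_rows, output_rows, input_field, output_field):
    """Return input rows whose tracking value is newer than anything in the output.

    Hypothetical helper: assumes "new" means strictly greater than the
    largest value of output_field that was already written.
    """
    processed = [row[output_field] for row in output_rows]
    if not processed:
        # No existing output to compare against: treat every row as new.
        return list(input_rows)
    last_processed = max(processed)
    return [row for row in input_rows if row[input_field] > last_processed]


# Sample data (made up for illustration).
existing_output = [
    {"order_id": 1, "order_date": date(2022, 3, 1)},
    {"order_id": 2, "order_date": date(2022, 3, 2)},
]
input_table = existing_output + [
    {"order_id": 3, "order_date": date(2022, 3, 3)},
]

new_rows = find_new_rows(input_table, existing_output, "order_date", "order_date")
print([row["order_id"] for row in new_rows])  # prints [3]
```

The two field arguments correspond to the input field and the output field in the list above; they can have different names as long as their values are comparable.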

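Flow refresh options

With Tableau Prep, you can choose how to refresh your data and how to update your tables with your flow output. The following table describes the different options and their benefits.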

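| Refresh combination | Data processed | Table update | Benefit |
|---|---|---|---|
| Full refresh + Create table | | Creates or overwrites the existing table using the full data set. | Refreshes all data on every flow run. |
| Full refresh + Append to table | | Adds new rows to the existing table. | Tracks both new and existing data with each flow run. "Append to table" is not available for .csv output types. |
| Full refresh + Replace data | | Replaces rows in the existing table. | Maintains the existing table schema structure, but replaces all data on every flow run. |
| Incremental refresh + Create table | New rows only | Creates or overwrites the existing table using only the new rows. | Creates a new table with only the new rows as the full data set. |
| Incremental refresh + Append to table | New rows only | Adds new rows to the existing table. | Adds only the new rows to the existing table. "Append to table" is not available for .csv output types. |
| Incremental refresh + Replace data | New rows only | Replaces all rows in the existing table with only the new rows. | Maintains the existing table schema structure, but replaces all data with only the new rows, making them the full data set. |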
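
As a rough way to read the table, the following Python sketch models what a table might contain after one run for each combination. The function `apply_run` and the list-based "table" are hypothetical simplifications. In particular, the sketch assumes that a full refresh with "append" adds every processed row again, and it does not model schema preservation, which is the practical difference between "create" and "replace".

```python
def apply_run(existing_table, full_data, new_rows, refresh, write_option):
    """Return the table contents after one flow run (illustrative model only).

    refresh:      "full" or "incremental"
    write_option: "create", "append", or "replace"
    """
    if refresh not in ("full", "incremental"):
        raise ValueError(f"unknown refresh type: {refresh}")
    # A full refresh processes the whole data set; incremental only new rows.
    processed = full_data if refresh == "full" else new_rows

    if write_option in ("create", "replace"):
        # Both leave the table holding only the processed rows.
        # "replace" additionally keeps the existing schema (not modeled here).
        return list(processed)
    if write_option == "append":
        return list(existing_table or []) + list(processed)
    raise ValueError(f"unknown write option: {write_option}")


existing = ["row1", "row2"]
full_data = ["row1", "row2", "row3"]
new_rows = ["row3"]

for refresh in ("full", "incremental"):
    for option in ("create", "append", "replace"):
        result = apply_run(existing, full_data, new_rows, refresh, option)
        print(f"{refresh} + {option}: {result}")
```

Under these assumptions, incremental refresh with "append" is the only combination that grows the table by exactly the new rows.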

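Configure incremental refresh

To configure your flow to use incremental refresh, specify settings on both the Input step and the Output step where you want to use this option. On the Input step, specify how Tableau Prep will find new rows. On the Output step, specify how the new rows are written to your table. When you run the flow, you can choose either a full refresh or an incremental refresh.

Tip: After you configure your input and output steps for incremental refresh, you can preserve the configuration and reuse it. Copy and paste the steps to use them elsewhere in your current flow, or, in Tableau Prep Builder, use Save Steps as Flow to save the selected steps to a local file or to your server for reuse in other flows. For more information about copying, pasting, or reusing steps, see Copy steps, actions, and fields.

  1. In the flow pane, select the input step that you want to configure for incremental refresh.

  2. In the Input pane, on the Settings tab, under Incremental Refresh (Set up Incremental Refresh in prior versions), set the following options: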

    • Select Enable incremental refresh (Enable in prior versions).

    • Input field (Identify new rows using field in prior versions): Select the field in your input data that Tableau Prep uses to detect new rows. This field must be assigned a data type of Number (whole), Date, or Date & Time. Currently, you can only select a single field.

      Note: You can remove or rename this field later in the flow, as long as the field you specify in the Output field (Field name in output in prior versions) can be used to compare this field with the latest output to find new rows.

    • Output: Select the output that is related to your input and that includes the field that will be used to compare rows.

    • Output field (Field name in output in prior versions): Select the field to use to compare the last processed values in the flow output with the values in the input to find new rows. This field must have the same data type as the field you specified in the Input field (Identify new rows using field in prior versions).
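
The constraints on these four settings can be summarized in a small Python check. This is a sketch for illustration only: the `IncrementalRefreshSettings` class and `validate` function are hypothetical and are not part of any Tableau API. They simply encode the data type rules stated in the options above.

```python
from dataclasses import dataclass

# Data types the input field may have, as listed in the options above.
ALLOWED_TYPES = {"Number (whole)", "Date", "Date & Time"}


@dataclass
class IncrementalRefreshSettings:
    """Hypothetical container mirroring the four options on the Settings tab."""
    enabled: bool
    input_field: str
    input_field_type: str
    output: str
    output_field: str
    output_field_type: str


def validate(settings):
    """Return a list of problems; an empty list means the rules are satisfied."""
    problems = []
    if not settings.enabled:
        return problems  # Nothing to check when incremental refresh is off.
    if settings.input_field_type not in ALLOWED_TYPES:
        problems.append(
            f"Input field '{settings.input_field}' must be Number (whole), "
            "Date, or Date & Time."
        )
    if settings.output_field_type != settings.input_field_type:
        problems.append(
            f"Output field '{settings.output_field}' must have the same data "
            "type as the input field."
        )
    return problems


good = IncrementalRefreshSettings(True, "order_date", "Date", "Orders", "order_date", "Date")
bad = IncrementalRefreshSettings(True, "region", "String", "Orders", "order_date", "Date")
print(validate(good))       # prints []
print(len(validate(bad)))   # prints 2
```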

Configure write options

To finish setting up incremental refresh, set your output Write Options to specify how the new rows are written to your tables. All outputs that are related to the configured input step have a default write option selected, but you can change it to a supported option.

You can output your rows to a file (Tableau Prep Builder only), a published data source or a database. By default, outputs to local or published .hyper extracts are set to Append to table. Outputs to .csv file types are set to Create table.

  1. In the flow pane, select the output step that you want to configure for incremental refresh.

  2. In the Output pane, in the Write Options section, view the default write option and make any changes as needed.

    • Create table: This option creates a new table or replaces the existing table with the new output.

    • Append to table: This option adds the new data to your existing table. If the table doesn't already exist, a new table is created when the flow is first run, and subsequent runs add new rows to this table. Not available for .csv output types. For more information about supported refresh combinations, see Flow refresh options.

    • Replace data (Tableau Prep Builder version 2020.3.1 and later and on the web): This option is available when you want to write your output back to an existing table in a database. It replaces the data in the database table with the flow data, but maintains the table schema structure.
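
The defaults and restrictions in this section can be expressed as a small lookup, sketched below in Python. The function names and the output type labels ("hyper", "csv", "database") are made up for this example. The sketch also assumes that "Replace data" applies only to database outputs and that "Create table" is accepted everywhere, which is one reading of the descriptions above rather than a confirmed rule.

```python
def default_write_option(output_type):
    """Return the default write option for an output type, if the text states one."""
    if output_type == "hyper":
        return "Append to table"
    if output_type == "csv":
        return "Create table"
    return None  # No default is stated above for other output types.


def is_supported(output_type, write_option):
    """Check a write option against the restrictions described above (assumed reading)."""
    if write_option == "Create table":
        return True
    if write_option == "Append to table":
        return output_type != "csv"  # Not available for .csv output types.
    if write_option == "Replace data":
        return output_type == "database"  # Writes back to an existing database table.
    raise ValueError(f"unknown write option: {write_option}")


print(default_write_option("hyper"))           # prints Append to table
print(is_supported("csv", "Append to table"))  # prints False
```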

Run your flow

You can run individual flows using incremental refresh in Tableau Prep Builder, on the web, or from the command line. For information about running your flow from the command line, see Run the flow with incremental refresh enabled.

If you have the Data Management Add-on with Tableau Prep Conductor enabled, you can run your flow with incremental refresh on a schedule on Tableau Server or Tableau Online. For information about running your flow on a schedule, see Schedule Flow Tasks in the Tableau Server help.

Note: In prior versions, write options are set in Tableau Prep Builder and can't be changed when running your flow in Tableau Server or Tableau Online. Starting in Tableau Server and Tableau Online version 2020.4, you can edit the flow directly on the web. For more information about using Tableau Prep on the web, see Tableau Prep on the Web in the Tableau Server help.

Tableau Prep runs a full refresh for all outputs regardless of the run option you select if no existing output is found. Subsequent flow runs use the incremental refresh process and retrieve and process only the new rows unless incremental refresh configuration data is missing or the existing output is removed.
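
The fallback rule in the previous paragraph can be written as a single decision function. The Python sketch below is only a restatement of that paragraph; the function name `effective_refresh` and its boolean parameters are hypothetical.

```python
def effective_refresh(requested, output_exists, config_complete):
    """Return the refresh type that actually runs, per the rule described above.

    requested:       "full" or "incremental" (the run option selected)
    output_exists:   whether an existing output was found
    config_complete: whether the incremental refresh configuration is present
    """
    if requested not in ("full", "incremental"):
        raise ValueError(f"unknown refresh type: {requested}")
    if requested == "incremental" and output_exists and config_complete:
        return "incremental"
    # Missing output or missing configuration falls back to a full refresh.
    return "full"


print(effective_refresh("incremental", output_exists=False, config_complete=True))  # prints full
print(effective_refresh("incremental", output_exists=True, config_complete=True))   # prints incremental
```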

  • To run the flow in Tableau Prep using incremental refresh, select Incremental refresh from one of the following locations:

    • From the top menu, click the drop-down option on the Run button.

    • From the Output pane, click the drop-down option on the Run Flow button.

    • From the Flow pane, click the drop-down on the Run button next to the Output step.

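  • If a single input that has incremental refresh enabled is associated with multiple outputs, those outputs must run together and must use the same refresh type. When you run the refresh in Tableau Prep, a dialog opens to let you know that both outputs must be run together.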
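
The rule about outputs that share an input can be checked before a run, as in the following Python sketch. It is an illustration under assumptions: `check_run_plan`, the `links` mapping from an input to its related outputs, and the `plan` mapping from each output selected to run to its refresh type are all hypothetical structures, not something Tableau Prep exposes.

```python
def check_run_plan(links, plan):
    """Return problems for outputs that share an incremental-refresh input.

    links: dict mapping an input name to the list of outputs related to it
    plan:  dict mapping each output selected to run to "full" or "incremental"
    """
    problems = []
    for input_name, outputs in links.items():
        selected = [name for name in outputs if name in plan]
        if len(outputs) < 2 or not selected:
            continue  # Single output, or none of this group is being run.
        missing = [name for name in outputs if name not in plan]
        if missing:
            problems.append(
                f"Outputs of '{input_name}' must run together; missing: {missing}"
            )
        if len({plan[name] for name in selected}) > 1:
            problems.append(
                f"Outputs of '{input_name}' must use the same refresh type."
            )
    return problems


links = {"Orders input": ["Daily output", "Summary output"]}
print(check_run_plan(links, {"Daily output": "incremental"}))
print(check_run_plan(links, {"Daily output": "incremental", "Summary output": "incremental"}))  # prints []
```

The first call reports that "Summary output" is missing from the run, which corresponds to the situation in which Tableau Prep shows its dialog.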