Use data roles to validate your data

  • Version: 2022.1 and later

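Note: Data source owners and Tableau administrators can add synonyms for specific data field names and values for Ask Data. For information about using data roles for Ask Data, see Add Synonyms for Ask Data in the Tableau Desktop help.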

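Use data roles to quickly identify whether the values in a field are valid. Tableau Prep provides a standard set of data roles that you can choose from, or you can create your own role using the unique field values in your data set.

When you assign a data role, Tableau Prep compares the standard values defined for the data role to the values in your field. Any values that don't match are marked with a red exclamation point. You can filter the field to see only valid or invalid values, then take the appropriate action to fix them. After you assign a data role to a field, you can use the Group Values option to group invalid values and match them to valid ones based on spelling and pronunciation.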

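Note: Starting in version 2020.4.1, you can create and edit flows in Tableau Server and Tableau Online. The content in this topic applies to all platforms unless specifically noted. For more information about authoring flows on the web, see Tableau Prep on the Web in the Tableau Server help.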

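Assign standard data roles to your data

Assign data roles provided by Tableau Prep to your fields the same way you assign a data type. The data role identifies what your data values represent so that Tableau Prep can automatically validate the values and highlight those that aren't valid for that role.

For example, if you have field values for geographic data, you can assign a data role of City, and Tableau Prep compares the values in the field to a set of known domain values to identify values that don't match.

Note: Each field is analyzed independently, so a City value of "Portland" in State "Washington" in Country "USA" might not be a valid city and state combination, but it won't be identified that way because it's a valid city name.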
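The per-field check described above can be pictured with a minimal Python sketch. It is not Tableau Prep's implementation: the reference sets `KNOWN_CITIES` and `KNOWN_STATES`, the `invalid_values` helper, and the case-insensitive comparison are all assumptions made for illustration.

```python
# Hypothetical reference sets; the real geographic data is far larger.
KNOWN_CITIES = {"portland", "seattle", "austin"}
KNOWN_STATES = {"washington", "oregon", "texas"}


def invalid_values(values, known):
    """Return the values that are not in the known set (assumed case-insensitive)."""
    return [v for v in values if v.strip().lower() not in known]


rows = [
    {"City": "Portland", "State": "Washington"},  # valid city, valid state
    {"City": "Seatle", "State": "Washington"},    # misspelled city
    {"City": "Austin", "State": "Texs"},          # misspelled state
]

# Each field is checked on its own, so the Portland/Washington pairing
# is never compared and therefore never flagged.
invalid_cities = invalid_values([r["City"] for r in rows], KNOWN_CITIES)
invalid_states = invalid_values([r["State"] for r in rows], KNOWN_STATES)

print(invalid_cities)
print(invalid_states)
```

Because the two columns are validated separately, only the misspelled values are reported, which mirrors the behavior the note describes.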

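Tableau Prep Builder provides the following data roles:

  • Email

  • URL

  • Geographic roles (based on current geographic data, the same data that Tableau Desktop uses)

    • Airport

    • Area Code (U.S.)

    • CBSA/MSA

    • City

    • Congressional District (U.S.)

    • Country/Region

    • NUTS Europe

    • State/Province

    • Zip Code/Postcode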

Tip: In Tableau Prep Builder version 2019.1.4 and later and on the web, if you assign a geographic role to a field, you can also use that data role to match and group values with the standard value defined by your data role. For more information about grouping values using data roles, see Clean and Shape Data.

To assign a data role to a field, do the following:

  1. In the Profile pane, Results pane or data grid, click the data type for the field.

  2. Select the data role for the field.

    Tableau Prep compares the field's data values to known domain values or patterns (for email or URL) for the data role you select and marks any values that don't match with a red exclamation point.

  3. Click the drop-down arrow for the field and from the Show Values section select an option to show all values or only values that are valid or not valid for the data role.

  4. Use the cleaning options on the More options menu for the field to correct any values that aren't valid. For more information about how to clean your field values, see About cleaning operations.
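For the Email and URL roles, values are compared to a pattern rather than to a list of known values. The following Python sketch illustrates that idea and the valid/not valid filter from step 3. The regular expression, the `http`/`https` restriction, and every function name here are assumptions for illustration; the patterns Tableau Prep actually uses are not given in this topic.

```python
import re
from urllib.parse import urlparse

# Deliberately loose, assumed pattern: something@something.something
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid_email(value):
    """True if the value looks like an email address under the assumed pattern."""
    return bool(EMAIL_PATTERN.match(value))


def is_valid_url(value):
    """True if the value has an http(s) scheme and a host part."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def split_by_validity(values, is_valid):
    """Mimic the Show Values filter: return (valid, not valid) lists."""
    valid = [v for v in values if is_valid(v)]
    invalid = [v for v in values if not is_valid(v)]
    return valid, invalid


emails = ["ana@example.com", "bob(at)example.com", "carol@example"]
valid, invalid = split_by_validity(emails, is_valid_email)
print(valid)
print(invalid)
```

The same `split_by_validity` helper works with `is_valid_url`, so one filter routine can serve both pattern-based roles.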

Create custom data roles

Starting in Tableau Prep Builder version 2019.3.1 and on the web, you can create custom data roles from the field values in your data sets. A custom data role is a standard set of values that you or others can use to validate fields when cleaning data. Select the field that you want to use, apply any cleaning operations to it if needed, then publish it to Tableau Server or Tableau Online to use it in your flow or share it with others.

If you create custom data roles while editing flows on the web, you can publish them directly to the server that you are signed in to.

Requirements

  • You can create custom data roles from single fields in your data set. Creating custom data roles from a combination of fields isn't supported.

  • You can create custom data roles only for fields assigned a data type of String or Number (whole).

  • When you create a custom data role, Tableau Prep creates an output step in your flow that is specific to publishing the data role.

  • Publishing custom data roles to multiple sites in the same flow isn't supported. If you publish the flow, you must publish the custom data role to the same site or server where the flow is published.

  • Custom data roles are specific to the site, server and project where you publish them. All users with permissions to the location can use the custom data role, but must be signed into the site or server to select it or apply it. Custom data roles are assigned the default permission for the All Users group for new projects instead of None.

  • Custom data roles aren't version specific. When applying a custom data role, the most current version is applied.

  • Once data roles are published to Tableau Server or Tableau Online, users with access to the site, server, and project can view all data roles in that location.

  • To edit a data role, you must make your changes in Tableau Prep Builder or in the flow on the web, then republish the data role using the same name to overwrite it. This process is similar to editing a published data source.

Create a custom data role

  1. In the Profile pane, data grid, or Results pane select the field you want to use to create a custom data role.

  2. Click More options for the field, and select Publish as Data Role.

  3. Select the server and project where you want to publish the data role.

  4. Click Run Flow to create the data role. After the publishing process completes successfully, you can view your data role in Tableau Server or Tableau Online. Processing the data role can take some time based on the load on your Tableau Server or Tableau Online site. If your data role isn't available right away, wait a few minutes, then try selecting it again.

Apply a custom data role

  1. In the Profile pane, Results pane or data grid, click the data type for the field where you want to apply the custom data role.

  2. Select Custom, then select the data role that you want to apply to the field.

    Important: In Tableau Prep Builder, make sure you are signed into the site or server where the data role was published or you won't see this option.

    Tableau Prep compares the field's data values to known domain values for the data role you select and marks any values that don't match with a red exclamation point.

  3. Click the drop-down arrow for the field and from the Show Values section select an option to show all values or only values that are valid or not valid for the data role.

  4. Use the cleaning options on the More options menu for the field to correct any values that aren't valid. For more information about how to clean your field values, see About cleaning operations.
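Conceptually, a custom data role is the set of distinct values from one cleaned field, which is then used to check another field. The Python sketch below illustrates only that idea. It does not publish anything to Tableau Server or Tableau Online, and `build_custom_role`, `mark_values`, and the sample department codes are hypothetical.

```python
def build_custom_role(field_values):
    """Collect the distinct, non-empty values of a single field as strings."""
    role = set()
    for value in field_values:
        if value is None:
            continue
        text = str(value).strip()
        if text:
            role.add(text)
    return role


def mark_values(field_values, role):
    """Pair each value with True (valid) or False (not valid) for the role."""
    return [(value, str(value).strip() in role) for value in field_values]


# A single field of whole numbers (String fields work the same way).
department_codes = [100, 200, 200, 300, None]
role = build_custom_role(department_codes)

# Apply the role to a different field.
marked = mark_values([100, 250, 300], role)
print(marked)
```

Values marked `False` correspond to the ones that would carry a red exclamation point after the role is applied.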

View and manage custom data roles

You can view and manage your published custom data roles on Tableau Server and Tableau Online. You can view all custom data roles published to your site or server. Click More actions for a selected data role to move it to a different project, change permissions or delete it.

Group similar values by data role

Note: In Tableau Prep Builder version 2019.1.4 and 2019.2.1 this option was labeled Data Role Matches.

If you assign a geographic data role to a field, you can use the values in the data role to group and match values in your data field based on spelling and pronunciation to standardize them. You can use either Spelling or Pronunciation + Spelling to group and match invalid values to valid ones.

These options use the standard value defined by the data role. If the standard value isn't in your data set sample, Tableau Prep adds it automatically and marks the value as not in the original data set. For more information about assigning data roles to fields, see Assign standard data roles to your data.

To use data roles to group values, complete the following steps.

  1. In the Profile pane, Results pane or data grid, click the data type for the field.

  2. Select one of the following data roles for the field:

    Standard data roles (version 2019.1.4 and later):

    • Airport

    • City

    • Country/Region

    • County

    • State/Province

    Custom data roles (Tableau Prep Builder version 2019.3.2 and later, and on the web): you can also select from your custom data roles.

    Tableau Prep compares the field's data values to known domain values for the data role you select and marks any values that don't match with a red exclamation point.

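  3. Click More options, select Group Values (Group and Replace in previous versions), then select one of the following options:

    • Spelling: Matches invalid values to the closest valid values that differ by adding, removing, or substituting characters.

    • Pronunciation + Spelling: Matches invalid values to the most similar valid value based on spelling and pronunciation.

    Tableau Prep compares the values by spelling, or by spelling and pronunciation, then groups similar values under the standardized value for the data role. If the standardized value isn't in your data set, Tableau Prep adds it and marks it with a red dot.

    You can also click the Recommendations icon on the field to apply the recommendation, which groups invalid values and replaces them with valid ones. This option uses the Pronunciation + Spelling group values option.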
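This topic does not say which algorithms sit behind the two grouping options. As a rough illustration only, the Python sketch below swaps in two well-known techniques: Levenshtein edit distance for spelling and a simplified Soundex code for pronunciation. The threshold `max_distance`, the fallback order, and all function names are assumptions, not Tableau Prep behavior.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


SOUNDEX_CODES = {}
for group, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                     ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in group:
        SOUNDEX_CODES[letter] = digit


def soundex(word):
    """Simplified Soundex: first letter plus three digits, zero padded."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    digits = []
    previous = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = SOUNDEX_CODES.get(ch, "")
        if code and code != previous:
            digits.append(code)
        if ch not in "hw":  # h and w do not separate repeated codes
            previous = code
    return (letters[0].upper() + "".join(digits) + "000")[:4]


def match_by_spelling(value, valid_values, max_distance=2):
    """Closest valid value within max_distance edits, else None."""
    best = min(valid_values,
               key=lambda v: edit_distance(value.lower(), v.lower()))
    if edit_distance(value.lower(), best.lower()) <= max_distance:
        return best
    return None


def match_by_pronunciation_and_spelling(value, valid_values, max_distance=2):
    """Try spelling first, then fall back to values with the same Soundex code."""
    match = match_by_spelling(value, valid_values, max_distance)
    if match is not None:
        return match
    same_sound = [v for v in valid_values if soundex(v) == soundex(value)]
    if not same_sound:
        return None
    return min(same_sound,
               key=lambda v: edit_distance(value.lower(), v.lower()))


VALID_CITIES = ["Seattle", "Portland", "Pittsburgh"]  # hypothetical role values
print(match_by_spelling("Seatle", VALID_CITIES))
print(match_by_spelling("Pitsberg", VALID_CITIES))
print(match_by_pronunciation_and_spelling("Pitsberg", VALID_CITIES))
```

In this sketch "Seatle" is one edit away from "Seattle", so spelling alone is enough. "Pitsberg" is three edits from "Pittsburgh", which exceeds the assumed threshold, but both share the Soundex code P321, so the phonetic fallback still groups them.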