Nexus Data Import

Walk one or more directories or files on your local disk and create data items from the file contents. Those items are pushed into a running Nexus database.

Usage

nexus_data_import [-h] [-server [server]]
                  [-user [username]] [-password [password]]
                  [-session_guid [session_guid]]
                  [-session_date [date]]
                  [-session_hostname [hostname]]
                  [-session_platform [platform]]
                  [-session_application [application]]
                  [-session_version [version]]
                  [-session_tags [tag_string]]
                  [-dataset_guid [dataset_guid]]
                  [-dataset_filename [filename]]
                  [-dataset_dirname [directory]]
                  [-dataset_format [format]]
                  [-dataset_parts [count]]
                  [-dataset_elements [count]]
                  [-dataset_tags [tag_string]]
                  [-data_date [date]]
                  [-data_sequence [number]]
                  [-data_name [name]]
                  [-data_source [name]]
                  [-data_tags [tag_string]]
                  [-data_categories [categoryA [categoryB ...]]]
                  [-disable_encoding_check]
                  [-ext_files [ext [ext ...]]]
                  [-ext_tables [ext [ext ...]]]
                  [-ext_html [ext [ext ...]]]
                  [-ext_movies [ext [ext ...]]]
                  [-ext_images [ext [ext ...]]]
                  [-ext_text [ext [ext ...]]]
                  [-ext_scenes [ext [ext ...]]]
                  [-table_col_labels] [-table_row_labels] [-table_numeric]
                  [-table_delimiter [char]] [-table_quote_char [char]]
                  [-verbose] [-defaults] [-simulate]
                  [file|directory [file|directory ...]]

Description

The nexus_data_import tool allows a user to upload data from files into a Nexus database. This is similar to the "Upload files into data items" application in the Nexus web interface. In addition to specifying files or directories to upload, many options for defining exactly how the data should be incorporated into the existing data at the Nexus server are provided. These include specification of session, dataset, tags, what data item type-specific file extensions should be uploaded as, and how tables should be handled.

Files uploaded as csv, txt, or html items must be strictly UTF-8 encoded. An encoding check runs on upload to enforce this requirement. The check is enabled by default, and an upload fails if the check fails. If you are confident that a file is correctly encoded, you can disable the check with -disable_encoding_check.
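
To find out ahead of time whether a file is likely to pass the encoding check, it can be decoded locally. The following is a minimal Python sketch and not part of the tool: the helper name is_utf8 is hypothetical, and the check that Nexus performs on upload may differ in detail.

```python
from pathlib import Path


def is_utf8(path):
    """Return True if the file at `path` decodes cleanly as UTF-8.

    Hypothetical pre-flight helper; the check Nexus runs on upload
    may be stricter or differ from this one.
    """
    try:
        Path(path).read_bytes().decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

Running such a check over csv, txt, and html files before an import helps separate genuine encoding problems from cases where -disable_encoding_check is appropriate.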

The tool takes a list of files and/or directories as arguments. When a directory is specified, the tool will recursively visit its contents, uploading all files that match the file extensions specified with the -ext_* options.
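
The traversal described above can be pictured with a short Python sketch. It only illustrates the behavior and is not the tool's implementation: the names DEFAULT_EXTENSIONS and find_uploads are hypothetical, the mapping copies the defaults given in the -ext_* option description, and both case-insensitive matching and filtering of explicitly named files are assumptions.

```python
import os

# Default extension-to-item-type mapping, copied from the -ext_* option
# description. The dictionary itself is illustrative.
DEFAULT_EXTENSIONS = {
    "png": "image", "jpg": "image",
    "mp4": "animation",
    "csv": "table",
    "stl": "scene", "ply": "scene", "csf": "scene", "avz": "scene",
}


def find_uploads(paths, extensions=DEFAULT_EXTENSIONS):
    """Yield (file_path, item_type) for every file with a mapped extension.

    Directories are walked recursively; plain files are checked directly
    (assumption: they are filtered by extension in the same way).
    """
    for path in paths:
        if os.path.isdir(path):
            candidates = (
                os.path.join(root, name)
                for root, _dirs, files in os.walk(path)
                for name in files
            )
        else:
            candidates = [path]
        for candidate in candidates:
            # Assumption: extensions are compared case-insensitively.
            ext = os.path.splitext(candidate)[1].lstrip(".").lower()
            if ext in extensions:
                yield candidate, extensions[ext]
```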

WARNING: The -password command-line option is passed to nexus_data_import in cleartext. Command lines are visible to all users on the same system, so use this option with care.

Options

If any of these options is specified multiple times on the command line, only the last occurrence takes effect.

  • -h, --help
    Show the usage message and exit
  • -server [server]
    The URL of the Nexus server to push data to
  • -user [username]
    The login name of the user to authenticate as. Uploads to a Nexus server are restricted to authorized users only. The default username is "nexus".
  • -password [password]
    The password of the user to authenticate as. The password is given as cleartext on the command line and is visible to all users on the system where the nexus_data_import tool is run. The default password is "cei".
  • -session_guid [GUID]
    The Globally Unique IDentifier (GUID) of the session to associate the data with. If omitted, a new session is created and assigned a new GUID. This new session GUID is then associated with the data. The additional -session_* command-line options are provided to specify attributes of the new session.
  • -session_date [date]
    The creation date for the created session. This option is ignored if an existing session GUID is used instead of creating a new session. If omitted, the current time stamp is used. The format of the date is that parsed by the Python dateutil.parser.parse method, which is able to parse most known formats.
  • -session_hostname [hostname]
    The host name to associate with the created session. This option is ignored if an existing session GUID is specified. The default is the host name of the system the nexus_data_import tool is run on.
  • -session_platform [platform]
    The platform (operating system) name to associate with the created session. This option is ignored if an existing session GUID is specified. The default is the platform of the system the nexus_data_import tool is run on.
  • -session_application [application]
    The application name to associate with the created session. This option is ignored if an existing session GUID is specified. The default is "nexus_data_import".
  • -session_version [version]
    The version to associate with the created session. This option is ignored if an existing session GUID is specified. The default is the version number of the nexus_data_import tool.
  • -session_tags [tag string]
    A space-separated list of tags to associate with the created session. This option is ignored if an existing session GUID is specified. The default is empty (no session tags). Note that session tags, dataset tags, and data item tags are distinct and separate. It's a common mistake to specify -session_tags when -data_tags is intended.
  • -dataset_guid [GUID]
    The GUID of the dataset to associate the data with. If omitted, a new dataset is created and assigned a new GUID. This new dataset GUID is then associated with the data. The additional -dataset_* command-line options are provided to specify attributes of the new dataset.
  • -dataset_filename [file name]
    The file name for the created dataset. This option is ignored if an existing dataset GUID is specified. The default file name is "unspecified".
  • -dataset_dirname [directory name]
    The directory name for the created dataset. This option is ignored if an existing dataset GUID is specified. The default directory name is the current working directory when the nexus_data_import tool was run.
  • -dataset_format [format]
    The format for the created dataset as a string. This option is ignored if an existing dataset GUID is specified. The default format string is "unspecified".
  • -dataset_parts [count]
    The number of parts in the created dataset. This is only useful when 3D scene data is imported. This option is ignored if an existing dataset GUID is specified. The default is 0.
  • -dataset_elements [count]
    The number of elements in the created dataset. This is only useful when 3D scene data is imported. This option is ignored if an existing dataset GUID is specified. The default is 0.
  • -dataset_tags [tag string]
    A space-separated list of tags to associate with the created dataset. This option is ignored if an existing dataset GUID is specified. The default is empty (no dataset tags). Note that session tags, dataset tags, and data item tags are distinct and separate. It's a common mistake to specify -dataset_tags when -data_tags is intended.
  • -data_date [date]
    The creation date for the created data items. The format of the date is that parsed by the Python dateutil.parser.parse method, which is able to parse most known formats. If the string "__timestamp__" is specified for the date, the creation date used will be that of each data file. The default is now, the time when the nexus_data_import tool was run.
  • -data_sequence [sequence number]
    The sequence number for the created data items. This is used if data items can be considered in a specific ordered sequence. The default sequence number is 0.
  • -data_name [name]
    The name for the created data items as a string. If "__file__" is specified, the name will be that of each data file. The default is "__file__".
  • -data_source [source]
    The source of the data as a string. The default is "data_import". Set this if you wish to track where data comes from.
  • -data_tags [tag string]
    A space-separated list of tags to associate with the data items. The default is empty (no data item tags). Note that session tags, dataset tags, and data item tags are distinct and separate. It's a common mistake to specify -session_tags or -dataset_tags when -data_tags is intended. See the Tags discussion below.
  • -data_categories [categoryA [categoryB ...]]
    List of space-separated categories to assign the item to. The default is empty (no item categories). This is a part of the ACLs feature. Set categories if you want to make use of permissions. You can only set categories that the authenticating user has 'own' permissions on.
  • -disable_encoding_check
    Disables the UTF-8 encoding check. Files uploaded as csv, txt, or html items must be strictly UTF-8 encoded, and a check on upload enforces this requirement. The check is enabled by default, and an upload fails if the check fails. Specify this option to skip the check if you are confident that your file is correctly encoded.
  • -ext_files [ext [ext …]]
    -ext_tables [ext [ext …]]
    -ext_html [ext [ext …]]
    -ext_movies [ext [ext …]]
    -ext_images [ext [ext …]]
    -ext_text [ext [ext …]]
    -ext_scenes [ext [ext …]]
    The -ext_* options map file extensions to data item types. Each -ext_* option corresponds to a different data item type and specifies the filename extensions that should be identified as that type. For example, -ext_html html htm specifies that *.html and *.htm files should be imported as HTML data items. By default, *.png and *.jpg map to Image items, *.mp4 maps to Animation items, *.csv maps to Table items, and *.stl, *.ply, *.csf, and *.avz map to Scene items. File items, String items, Tree items, and HTML items have no default mapping. Giving a -ext_* option with no arguments specifies that no file extensions are mapped to that data item type.
  • -table_col_labels
    -table_row_labels
    These two options specify that imported tables should consider the first column and first row, respectively, as labels for the table. Omitting either option indicates that the data file has no labels in the corresponding column (or row).
  • -table_numeric
    Using this option indicates that table data should be coerced into numeric data types instead of strings (the default).
  • -table_delimiter [char]
    The character that separates cell items from each other in a table input file. By default, the separator is a comma. Specifying nothing for -table_delimiter indicates that no separation should be done and each line is imported as a single cell item.
  • -table_quote_char [char]
    The character that delimits quoted text in cell items in a table input file. By default, the quote character is a single quote ('). This allows for the table delimiter to be used within a single cell. Specifying nothing for -table_quote_char indicates that no special parsing should occur for quoted strings.
  • -verbose
    Enables verbose output from the nexus_data_import tool. When specified, this option will cause the tool to print many extra messages, including data types, files processed, etc.
  • -defaults
    Prints out the defaults used by the nexus_data_import tool and then exits.
  • -simulate
    Causes the nexus_data_import tool to simulate the pushing of data items to a Nexus server without actually doing so. This is helpful when debugging data import, especially when used in conjunction with the -verbose option.
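
When the tool is driven from a script, building the argument list in Python avoids shell-quoting problems with space-separated tag strings. The sketch below is an example under stated assumptions, not a prescribed workflow: build_command is a hypothetical helper, the server URL, tags, and directory name are placeholders, and nexus_data_import is assumed to be on the PATH. It adds -simulate and -verbose so that nothing is pushed to the server.

```python
import shlex


def build_command(server, paths, data_tags=None, simulate=True):
    """Assemble an argument list for nexus_data_import (hypothetical helper)."""
    cmd = ["nexus_data_import", "-server", server]
    if data_tags:
        # The whole tag string is passed as ONE argument, tags separated by spaces.
        cmd += ["-data_tags", " ".join(data_tags)]
    if simulate:
        # Dry run: report what would be imported without pushing anything.
        cmd += ["-simulate", "-verbose"]
    # Files and directories come last.
    cmd += list(paths)
    return cmd


cmd = build_command(
    "http://localhost:8000",            # placeholder server URL
    ["results"],                        # placeholder directory
    data_tags=["verified", "variable=pressure"],
)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Once the simulated run reports the expected files and item types, the same list can be rebuilt with simulate=False to perform the real import.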

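The effect of -table_delimiter, -table_quote_char, and -table_numeric can be approximated with Python's csv module. This is a sketch for reasoning about input files, not the tool's parser: parse_table is a hypothetical name, and converting cells that are not numbers to NaN is an assumption.

```python
import csv
import io
import math


def parse_table(text, delimiter=",", quote_char="'", numeric=False):
    """Split table text into rows of cells (hypothetical helper).

    Defaults mirror the documented ones: comma delimiter, single-quote
    quote character, cells kept as strings.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar=quote_char)
    rows = [row for row in reader if row]  # skip empty lines
    if not numeric:
        return rows

    def to_number(cell):
        try:
            return float(cell)
        except ValueError:
            return math.nan  # assumption: non-numeric cells become NaN

    return [[to_number(cell) for cell in row] for row in rows]


# The quoted cell keeps its embedded comma instead of being split in two.
sample = "name,value\n'Smith, J',1.5\n"
print(parse_table(sample))
```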

 New in 2020 R2:
 The option -data_categories was added.

Tags

Tags in Nexus are simple strings and can be applied to data items, sessions, and datasets. Tags are not required, but they offer a flexible way to define a data model. Without them, queries can be done only with the default data that Nexus provides (dates, names, connections). Best practice involves creating a data model through the judicious use of tags. Depending on the data model desired, the presence or absence of a tag is significant. For instance, one data model might attach significance to a tag "verified" applied to a data item. Such a tag might indicate that the data item has gone through a review process, and the absence of such a tag indicates that the process has not been performed. Other tags take the form "key=value". Tags in this format are more fine-grained because they can take on different values. For instance, it's common to find tags "variable=pressure" or "variable=density" applied to image data that comes from physical simulations. Any number of tags may be specified; they are separated by spaces when defined. So the tag string "verified variable=viscosity" is a valid list of two tags, the second of which has a value.
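
The tag string format can be illustrated with a few lines of Python. This is a sketch of the format as described above, not code from Nexus: parse_tags is a hypothetical helper, and representing value-less tags as None is a choice made only for this example.

```python
def parse_tags(tag_string):
    """Split a tag string into a dict (hypothetical helper).

    Plain tags map to None; "key=value" tags map key to value.
    """
    tags = {}
    for token in tag_string.split():
        key, sep, value = token.partition("=")
        tags[key] = value if sep else None
    return tags


print(parse_tags("verified variable=viscosity"))
# prints {'verified': None, 'variable': 'viscosity'}
```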