Apache ORC support

Applies to: Dataedo 23.x (current) versions, Article available also for: 10.x

Dataedo 9.3 added support for ORC files. Dataedo scans an ORC file and builds a structure that includes:

Supported Metadata

Metadata

  • Primitive types (boolean, byte, int, long, float, string ...)
  • Compound types:
    • Struct
    • List
    • Map
    • Union

Each field contains:

  • Name,
  • Data type,
  • Nullability.
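
To make the nesting of primitive and compound types concrete, here is a minimal Python sketch that turns an ORC type description string (the `struct<name:type,...>` notation used by ORC, with `array<...>` for lists, `map<key,value>` and `uniontype<...>`) into a tree of fields with a name and a data type. It illustrates the shape of the metadata only and is not Dataedo's implementation. The names `OrcField`, `parse_type` and `describe`, and the child names `element`, `key`, `value` and `optionN`, are hypothetical. The type string carries no nullability flag, so the sketch leaves that attribute out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OrcField:
    """One node of the parsed structure (hypothetical helper type)."""
    name: str
    data_type: str  # e.g. "int", "struct", "array", "map", "uniontype"
    children: List["OrcField"] = field(default_factory=list)


def split_top_level(text: str) -> List[str]:
    """Split on commas that are not nested inside <...> or (...)."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch in "<(":
            depth += 1
        elif ch in ">)":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    if text[start:]:
        parts.append(text[start:])
    return parts


def parse_type(name: str, type_str: str) -> OrcField:
    """Parse one ORC type description into an OrcField tree."""
    type_str = type_str.strip()
    if "<" not in type_str:
        # Primitive type, including parameterized ones such as decimal(10,2).
        return OrcField(name, type_str)
    if not type_str.endswith(">"):
        raise ValueError(f"unbalanced type description: {type_str!r}")
    kind, inner = type_str.split("<", 1)
    items = split_top_level(inner[:-1])  # drop the closing ">"
    if kind == "struct":
        children = []
        for item in items:
            child_name, child_type = item.split(":", 1)
            children.append(parse_type(child_name.strip(), child_type))
    elif kind == "array" and len(items) == 1:
        children = [parse_type("element", items[0])]
    elif kind == "map" and len(items) == 2:
        children = [parse_type("key", items[0]), parse_type("value", items[1])]
    elif kind == "uniontype":
        children = [parse_type(f"option{i}", t) for i, t in enumerate(items)]
    else:
        raise ValueError(f"unsupported or malformed type: {type_str!r}")
    return OrcField(name, kind, children)


def describe(node: OrcField, indent: int = 0) -> List[str]:
    """Render the tree as indented 'name: type' lines."""
    lines = [" " * indent + f"{node.name}: {node.data_type}"]
    for child in node.children:
        lines.extend(describe(child, indent + 2))
    return lines


if __name__ == "__main__":
    schema = "struct<id:bigint,tags:array<string>,attrs:map<string,double>>"
    print("\n".join(describe(parse_type("root", schema))))
```

Running the script prints one indented line per field, which roughly mirrors how a nested structure appears as a tree of named, typed fields.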

Data profiling

Dataedo does not support data profiling in ORC files.

How to import an ORC file

To import an ORC file into Dataedo:

  • right-click any database or the Structures folder, choose Add Object, then Add/Import Structure, or
  • on the main ribbon select Add Object, then Structure/File, or
  • select the Structures folder and on the main ribbon select Add Structure/File.

Then select Import from file and choose the ORC format. Point to an ORC file on the disk and click Next. Dataedo scans the file's content and opens the Structure designer with the parsed structure. In this window you can edit names, data types and field types, then save your changes with the Save button.
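
Before pointing the import at a file, it can help to confirm that the file really is ORC. The sketch below relies on the ORC file format beginning with the three magic bytes `ORC`. It is an optional pre-check run outside Dataedo, not part of the product, and the function names `has_orc_magic` and `looks_like_orc` are hypothetical. A file that fails the check is unlikely to be valid ORC. Passing it does not guarantee that the rest of the file is intact.

```python
ORC_MAGIC = b"ORC"  # ORC files start with these three bytes


def has_orc_magic(header: bytes) -> bool:
    """Return True if the given leading bytes start with the ORC magic."""
    return header.startswith(ORC_MAGIC)


def looks_like_orc(path: str) -> bool:
    """Read the first bytes of the file at `path` and check the magic."""
    with open(path, "rb") as handle:
        return has_orc_magic(handle.read(len(ORC_MAGIC)))
```

For example, `looks_like_orc("sales.orc")` (a hypothetical file name) returns `True` only when the file starts with the expected header.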

ORC Structure designer

Guide: Adding files to the catalog
