read_csv

DataFrame.read_csv(fnames, append=False, quotechar='"', sep=',', num_lines_read=0, skip=0, colnames=None, time_formats=None, verbose=True) → DataFrame

Read CSV files.

Unless colnames is passed, the first line of each CSV file is assumed to be a header containing the column names.

Args:
fnames (List[str]):

CSV file paths to be read.

append (bool, optional):

If a data frame object with the same name already exists in the getML engine, should the content of the CSV files in fnames be appended to the existing data (True) or replace it (False)?

quotechar (str, optional):

The character used to wrap strings.

sep (str, optional):

The separator used for separating fields.
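
To make the roles of quotechar and sep concrete, the following sketch writes and re-reads a small semicolon-separated file with Python's standard csv module. It does not call getML at all; the sample rows are made up for illustration, and it only shows what kind of input would call for sep=';' together with the default quotechar.

```python
import csv
import io

# Made-up sample data; the second row contains the separator inside a field.
rows = [["name", "comment"], ["Alice", "likes a; b"], ["Bob", "plain"]]

# Write the rows using ';' as the separator and '"' as the quote character.
buffer = io.StringIO()
writer = csv.writer(
    buffer,
    delimiter=";",
    quotechar='"',
    quoting=csv.QUOTE_MINIMAL,  # quote only fields that need it
    lineterminator="\n",
)
writer.writerows(rows)
text = buffer.getvalue()
print(text)

# Reading with the same separator and quote character recovers the rows.
parsed = list(csv.reader(io.StringIO(text), delimiter=";", quotechar='"'))
print(parsed == rows)
```

A file like this would be read with sep=';' and quotechar='"', so that the semicolon inside the quoted field is not treated as a field boundary.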

num_lines_read (int, optional):

Number of lines read from each file. Set to 0 to read in the entire file.

skip (int, optional):

Number of lines to skip at the beginning of each file.

colnames (List[str] or None, optional):

The first line of a CSV file usually contains the column names. When this is not the case, you need to explicitly pass them.

time_formats (List[str], optional):

The list of formats tried when parsing time stamps.

The formats are allowed to contain the following special characters:

  • %w - abbreviated weekday (Mon, Tue, …)

  • %W - full weekday (Monday, Tuesday, …)

  • %b - abbreviated month (Jan, Feb, …)

  • %B - full month (January, February, …)

  • %d - zero-padded day of month (01 .. 31)

  • %e - day of month (1 .. 31)

  • %f - space-padded day of month ( 1 .. 31)

  • %m - zero-padded month (01 .. 12)

  • %n - month (1 .. 12)

  • %o - space-padded month ( 1 .. 12)

  • %y - year without century (70)

  • %Y - year with century (1970)

  • %H - hour (00 .. 23)

  • %h - hour (00 .. 12)

  • %a - am/pm

  • %A - AM/PM

  • %M - minute (00 .. 59)

  • %S - second (00 .. 59)

  • %s - seconds and microseconds (equivalent to %S.%F)

  • %i - millisecond (000 .. 999)

  • %c - centisecond (0 .. 9)

  • %F - fractional seconds/microseconds (000000 - 999999)

  • %z - time zone differential in ISO 8601 format (Z or +NN.NN)

  • %Z - time zone differential in RFC format (GMT or +NNNN)

  • %% - percent sign
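
The idea of trying a list of formats can be sketched in plain Python. This is not the parser getML uses: it relies on datetime.strptime, is limited to specifiers that mean roughly the same thing there (%Y, %m, %d, %H, %M, %S), and assumes the formats are tried in list order. The helper name first_matching_format is hypothetical.

```python
from datetime import datetime
from typing import List, Optional


def first_matching_format(value: str, time_formats: List[str]) -> Optional[str]:
    """Return the first format that parses value completely, or None.

    Hypothetical helper for illustration; it uses Python's strptime,
    not the parser behind read_csv.
    """
    for fmt in time_formats:
        try:
            datetime.strptime(value, fmt)
        except ValueError:
            # This format does not describe the string; try the next one.
            continue
        return fmt
    return None


candidates = ["%Y-%m-%d %H:%M:%S", "%Y-%m-%d", "%d.%m.%Y"]

for sample in ["2019-03-04 12:30:00", "2019-03-04", "04.03.2019", "March 4th"]:
    print(sample, "->", first_matching_format(sample, candidates))
```

Specifiers such as %w, %a, %A, %e, %f, %z and %Z have a different meaning in strptime than in the list above, so they are deliberately left out of this sketch.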

verbose (bool, optional):

If True and fnames are URLs, the file names are printed to stdout during the download.

Returns:
DataFrame:

Handle to the underlying data.
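
A call using the parameters documented above might look like the following sketch. It assumes df is an already constructed DataFrame object; how such an object is created is not covered in this section. The wrapper name load_semicolon_csvs and the chosen argument values are illustrative only.

```python
from typing import Any, List


def load_semicolon_csvs(df: Any, fnames: List[str]) -> Any:
    """Read ';'-separated CSV files into an existing data frame object.

    Hypothetical wrapper: df is assumed to expose read_csv with the
    signature documented above.
    """
    return df.read_csv(
        fnames,
        append=False,       # replace existing data instead of appending
        quotechar='"',      # character that wraps strings
        sep=";",            # fields are separated by semicolons
        num_lines_read=0,   # 0 reads each file entirely
        skip=0,             # do not skip any leading lines
        colnames=None,      # take column names from the header line
        time_formats=["%Y-%m-%d %H:%M:%S", "%Y-%m-%d"],
        verbose=False,
    )
```

Since read_csv returns a DataFrame, the result of the wrapper can be assigned and used in place of df.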