
Make a csv file by applying the specifications in the Access registration database to the raw data of the selected file group (filgruppe). All files in the selected group are processed unless specific files are chosen, either by KOBLID with the koblid argument or by position with the select argument. Restricting the run with koblid or select is especially useful for testing.

This is the most used function in KHelse for processing raw data. lag_fil() is an alias of make_file().


Usage

make_file(
  group = NULL,
  koblid = NULL,
  aggregate = NULL,
  save = FALSE,
  year.geo = NULL,
  implicitnull = NULL,
  row = NULL,
  base = NULL,
  parallel = deprecated(),
  raw = NULL,
  select = NULL
)




Arguments

group
The name of the file group, as specified in filgruppe.


koblid
KOBLID from the table tbl_Koble.


aggregate
Logical value. Default is TRUE. Aggregate data according to the specification in the registration database. Can be set globally with the option orgdata.aggregate.


save
Logical value. Default is FALSE. If TRUE, the result is saved as a .csv file by calling save_file().


year.geo
Reference year to use for geographical coding. If missing, the global option orgdata.year is used.


implicitnull
Logical value. Default is TRUE, which adds implicit null to the dataset. Can be set globally with the option orgdata.implicit.null.
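
The global options named in these descriptions can be set with base R options(). The sketch below assumes that passing NULL for an argument means "use the global option". The helper resolve_arg() is hypothetical and not part of the package; it only illustrates that fallback.

```r
# Hypothetical helper, NOT part of the package: use the explicit argument
# when it is given, otherwise fall back to the named global option.
resolve_arg <- function(value, option_name) {
  if (is.null(value)) getOption(option_name) else value
}

# Set session-wide values with base R options(). The option names come from
# the argument descriptions; the values here are only examples.
old <- options(orgdata.aggregate = TRUE, orgdata.implicit.null = TRUE)

agg_default  <- resolve_arg(NULL, "orgdata.aggregate")       # falls back to the option
agg_override <- resolve_arg(FALSE, "orgdata.aggregate")      # explicit value wins
null_default <- resolve_arg(NULL, "orgdata.implicit.null")   # falls back to the option

options(old)  # restore the previous option values
```

Setting the option once per session avoids repeating the same argument in every call, while an explicit argument still overrides it for a single run.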


row
Select specific row number(s) only. Useful for debugging; see the Debugging article for details.


base
Logical value. If TRUE, the year in the original data is used as the base year when recoding the geographical codes. Default is FALSE, which uses all available codes in the geo codebook.


parallel
Logical or numeric value. With TRUE, processing runs in parallel on 50% (0.5) of the local cores. Another share can be given as a number, for example parallel = 0.75 to use 75% of the cores, but at most 80% of the cores can be used. Default is FALSE, i.e. sequential processing. This argument is marked as deprecated() in the function signature.
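
The rule described for parallel can be sketched as a mapping from the argument value to a number of workers. cores_to_use() is a hypothetical helper, not part of the package. Rounding down and always keeping at least one core are assumptions made for illustration.

```r
# Hypothetical helper, NOT part of the package.
# share: FALSE, TRUE, or a fraction of the cores (e.g. 0.75)
# total: number of local cores; defaults to what base R's parallel package reports
cores_to_use <- function(share, total = parallel::detectCores()) {
  if (isFALSE(share)) return(1L)      # FALSE: sequential processing
  if (isTRUE(share)) share <- 0.5     # TRUE: 50% of the cores
  share <- min(share, 0.8)            # never more than 80% of the cores
  # Assumption: round down, but always keep at least one core
  max(1L, as.integer(floor(share * total)))
}

half     <- cores_to_use(TRUE, total = 8)
three_q  <- cores_to_use(0.75, total = 8)
capped   <- cores_to_use(0.9, total = 8)   # 0.9 is reduced to the 0.8 cap
seq_only <- cores_to_use(FALSE, total = 8)
```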


raw
Logical value. Default is FALSE, as in the config. If TRUE, the original raw data is read directly from the source file even when the dataset is already available in DuckDB, without the need to unmark KONTROLLERT in the Access database.


select
Select which of the valid files to process, by position, as an alternative to using KOBLID. To select the first 5 files, write select = 1:5. Use select = "last" to select the last, i.e. most recent, file.
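
The difference between selecting by position and selecting by KOBLID can be sketched as follows. pick_files() is a hypothetical helper, not part of the package, and the KOBLID values are made up. The sketch assumes the valid files are ordered so that the last one is the most recent.

```r
# Hypothetical helper, NOT part of the package: choose files by position.
pick_files <- function(koblid, select = NULL) {
  if (is.null(select)) return(koblid)                             # no selection: all files
  if (identical(select, "last")) return(koblid[length(koblid)])   # last file only
  koblid[select]                                                  # positions, e.g. 1:5
}

# Made-up KOBLID values, assumed to be ordered from oldest to most recent
valid <- c(120, 121, 122, 123, 124, 125, 126)

all_files   <- pick_files(valid)
first_five  <- pick_files(valid, select = 1:5)
most_recent <- pick_files(valid, select = "last")
```

Under these assumptions, select = 1:5 refers to positions in the list of valid files, whereas koblid = 120:125 refers to the KOBLID values themselves.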

See also

Other filegroups functions: make_filegroups()


Examples

if (FALSE) { # \dontrun{
dt <- make_file("ENPERSON")
dt <- make_file("ENPERSON", raw = TRUE) # Skip DuckDB and read directly from the original files
dt <- make_file("ENPERSON", koblid = 120:125) # Select specific files only
dt <- make_file("ENPERSON", select = "last") # Select the most recent file
} # }