Make a csv file from the specifications in the Access registration database and apply them to the raw data of the selected group of files, i.e. the file group (filgruppe). All files under the selected group are affected unless KOBLID is specified with the koblid argument or the select argument is used. Specifying koblid or select is especially useful for testing.
This function is the most used function in KHelse for processing raw data. The function lag_fil() is an alias for make_file().
Usage
make_file(
  group = NULL,
  koblid = NULL,
  aggregate = NULL,
  save = FALSE,
  year.geo = NULL,
  implicitnull = NULL,
  row = NULL,
  base = NULL,
  parallel = deprecated(),
  raw = NULL,
  select = NULL
)

lag_fil(
  group = NULL,
  koblid = NULL,
  aggregate = NULL,
  save = FALSE,
  year.geo = NULL,
  implicitnull = NULL,
  row = NULL,
  base = NULL,
  parallel = deprecated(),
  raw = NULL,
  select = NULL
)

mf(
  group = NULL,
  koblid = NULL,
  aggregate = NULL,
  save = FALSE,
  year.geo = NULL,
  implicitnull = NULL,
  row = NULL,
  base = NULL,
  parallel = deprecated(),
  raw = NULL,
  select = NULL
)
Arguments
- group
The name of the file group as specified in filgruppe.
- koblid
KOBLID from table tbl_Koble.
- aggregate
Logical value. Default is TRUE. Aggregate data according to the specification in the registration database. The global option is orgdata.aggregate.
- save
Logical value. Default is FALSE. If TRUE, save the result as a .csv file by activating the save_file() function.
- year.geo
Which reference year to use for geographical coding. If missing, the global option orgdata.year is used.
- implicitnull
Logical value. Default is TRUE, which adds implicit null to the dataset. The global option is orgdata.implicit.null.
- row
Select specific row number(s) only. Useful for debugging. See the Debugging article for details.
- base
Logical value. If TRUE, use the year in the original data as the base year when recoding the geographical codes. Default is FALSE, which uses all available codes in the geo codebook.
- parallel
Logical or numeric value. With TRUE, processing runs in parallel using 50%, i.e. 0.5, of the local cores. Another share can be given if needed, for example parallel = 0.75 to use 75% of the cores. The maximum allowed is 80% of the cores. Default is FALSE, i.e. sequential processing. Shown as deprecated() in the usage above.
- raw
Logical value. Default is FALSE as in config. If TRUE, read the original raw data directly from the source file even if the dataset is already available in DuckDB, without the need to unmark KONTROLLERT in the Access database.
- select
Select which of the valid files to process, as an alternative to using KOBLID. To select the first 5 files, write select = 1:5. Use select = "last" to select the last, i.e. most recent, file.
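Several arguments above name a global option (orgdata.aggregate, orgdata.implicit.null, orgdata.year). The sketch below sets such options for a session with base R options(). The option names are taken from the argument descriptions. That make_file() falls back to them when the matching argument is left as NULL is an assumption based on those descriptions, and the year 2024 is only a placeholder.

```r
# Sketch only: session-wide defaults set with base R options().
# Assumption: make_file() reads these options when the matching
# argument is left as NULL.
old <- options(
  orgdata.aggregate = TRUE,      # assumed fallback for `aggregate`
  orgdata.implicit.null = TRUE,  # assumed fallback for `implicitnull`
  orgdata.year = 2024            # assumed fallback for `year.geo`; placeholder value
)

# Inspect the value currently in effect
getOption("orgdata.year")

# Restore the previous values when done
options(old)
```

Passing the argument explicitly in the make_file() call is the more direct choice when only a single run should differ from the defaults.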
See also
Other filegroups functions:
make_filegroups()
Examples
if (FALSE) { # Not run
  dt <- make_file("ENPERSON")
  dt <- make_file("ENPERSON", raw = TRUE)       # Skip DuckDB and read directly from original files
  dt <- make_file("ENPERSON", koblid = 120:125) # Select specific files only
  dt <- make_file("ENPERSON", select = "last")  # Select most recent file
}