MergeTextFilesWithSD.pl - Merge CSV or TSV TextFile(s) into SDFile
MergeTextFilesWithSD.pl SDFile TextFile(s)...
MergeTextFilesWithSD.pl [-h, --help] [--indelim comma | semicolon] [-c, --columns colnum,...;... | collabel,...;...] [-k, --keys colkeynum;... | colkeylabel;...] [-m, --mode colnum | collabel] [-o, --overwrite] [-r, --root rootname] [-s, --sdkey sdfieldname] [-w, --workingdir dirname] SDFile TextFile(s)...
Merge multiple CSV or TSV TextFile(s) into SDFile. Unless the -k, --keys option is used, data rows from all TextFile(s) are added to SDFile in sequential order, and the number of compounds in SDFile determines how many rows of data are added from TextFile(s).
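The sequential behavior described above can be pictured with a minimal Python sketch. It is an illustration only, not the script's Perl implementation: merge_sequential and the dictionary representation of compounds and rows are hypothetical. Two points the description does not settle are assumed here: a compound is left unchanged when the text file runs out of rows, and a text-file value replaces an SD field of the same name.

```python
def merge_sequential(compounds, rows):
    """Attach the i-th text-file row to the i-th compound.

    compounds: list of dicts holding SD data fields (hypothetical representation)
    rows:      list of dicts holding one text-file row each
    """
    merged = []
    for index, fields in enumerate(compounds):
        new_fields = dict(fields)
        if index < len(rows):
            # Assumption: text-file values win on a field-name collision.
            new_fields.update(rows[index])
        # Assumption: compounds beyond the last row are kept unchanged.
        merged.append(new_fields)
    # Rows beyond the number of compounds are never visited.
    return merged


compounds = [{"Mol_ID": "M1"}, {"Mol_ID": "M2"}]
rows = [{"MolWeight": "16.04"}, {"MolWeight": "78.11"}, {"MolWeight": "18.02"}]
merged = merge_sequential(compounds, rows)
```

Here the third row is dropped because the SD data holds only two compounds.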
Multiple TextFile(s) names are separated by spaces. The valid file extensions are .csv for comma- or semicolon-delimited text files and .tsv for tab-delimited text files. Files with any other extension are ignored. All the text files in the current directory can be specified by *.csv, *.tsv, or the current directory name. The --indelim option determines the format of TextFile(s). Any file which doesn't correspond to the format indicated by the --indelim option is ignored.
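As a rough sketch of the extension rule above, the following Python function maps a file name to the delimiter that would be used to read it. The name delimiter_for is hypothetical, and the exact, case-sensitive extension match is an assumption; the script's own checks may differ.

```python
from pathlib import Path


def delimiter_for(filename, indelim="comma"):
    """Return the field delimiter for a text file, or None if the file is ignored."""
    extension = Path(filename).suffix
    if extension == ".tsv":
        return "\t"  # --indelim does not apply to TSV files
    if extension == ".csv":
        return {"comma": ",", "semicolon": ";"}[indelim]
    return None  # any other extension is skipped


files = ["Sample1.csv", "Sample2.tsv", "Notes.txt"]
selected = {}
for name in files:
    delimiter = delimiter_for(name, indelim="semicolon")
    if delimiter is not None:
        selected[name] = delimiter
```

With these inputs, Notes.txt is left out of selected, and the semicolon setting affects only the .csv file.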
-h, --help
Print this help message.
--indelim comma | semicolon
Input delimiter for CSV TextFile(s). Possible values: comma or semicolon. Default value: comma. For TSV files, this option is ignored and tab is used as a delimiter.
-c, --columns colnum,...;... | collabel,...;...
This value is mode specific. It is a list of columns to merge into SDFile, specified by column numbers or labels for each text file; the per-file lists are delimited by ";". All TextFile(s) are merged into SDFile.
Default value: all;all;.... By default, all columns from TextFile(s) are merged into SDFile.
For colnum mode, input value format is: colnum,...;colnum,...;.... Example: "1,2;3,4,5"
For collabel mode, input value format is: collabel,...;collabel,...;.... Example: "Mol_ID,Formula,MolWeight;Mol_ID,ChemBankID,NAME"
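The two input formats above share one shape: ";" separates the per-file groups and "," separates the columns within a group. A minimal Python sketch of that parsing follows. The function parse_columns is hypothetical, and rejecting a specification whose group count differs from the number of text files is an assumption, not documented behavior.

```python
def parse_columns(spec, n_files, mode="colnum"):
    """Split a --columns value into one column list per text file."""
    # An empty value stands in for the default of all;all;...
    groups = spec.split(";") if spec else ["all"] * n_files
    if len(groups) != n_files:
        # Assumption: one group is expected per text file.
        raise ValueError("expected one column group per text file")
    parsed = []
    for group in groups:
        if group == "all":
            parsed.append("all")
        elif mode == "colnum":
            parsed.append([int(column) for column in group.split(",")])
        else:  # collabel
            parsed.append(group.split(","))
    return parsed


by_number = parse_columns("1,2;3,4,5", n_files=2)
by_label = parse_columns(
    "Mol_ID,Formula,MolWeight;Mol_ID,ChemBankID,NAME", n_files=2, mode="collabel"
)
```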
-k, --keys colkeynum;... | colkeylabel;...
This value is mode specific. It specifies the column keys to use for merging TextFile(s) into SDFile. The column keys, one per text file and delimited by ";", are specified by column numbers or labels for TextFile(s).
By default, data rows from TextFile(s) are merged into SDFile in the order they appear.
For colnum mode, input value format is: colkeynum;colkeynum;.... Example: "1;1"
For collabel mode, input value format is: colkeylabel;colkeylabel;.... Example: "Mol_ID;Mol_ID"
-m, --mode colnum | collabel
Specify how to merge TextFile(s) into SDFile: using column numbers or column labels. Possible values: colnum or collabel. Default value: colnum.
-o, --overwrite
Overwrite existing files.
-r, --root rootname
New SD file name is generated using the root: <Root>.sdf. Default file name: <InitialSDFileName>MergedWith<FirstTextFileName>1To<Count>.sdf.
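The naming rule above can be sketched in Python as follows. This is a guess at how the placeholders expand, not the script's code: output_sd_name is hypothetical, and the sketch assumes that <InitialSDFileName> and <FirstTextFileName> are the file names without their extensions and that <Count> is the number of text files.

```python
from pathlib import Path


def output_sd_name(sd_file, text_files, root=None):
    """Build the output SD file name from --root or from the default pattern."""
    if root:
        return f"{root}.sdf"
    sd_stem = Path(sd_file).stem            # assumed: name without extension
    first_stem = Path(text_files[0]).stem   # assumed: name without extension
    count = len(text_files)                 # assumed: number of text files
    return f"{sd_stem}MergedWith{first_stem}1To{count}.sdf"


with_root = output_sd_name("Sample.sdf", ["Sample1.csv", "Sample2.csv"], root="NewSample")
default_name = output_sd_name("Sample.sdf", ["Sample1.csv", "Sample2.csv"])
```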
-s, --sdkey sdfieldname
SDFile data field name used as a key to merge data from TextFile(s). By default, data rows from TextFile(s) are merged into SDFile in the order they appear.
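Key-based merging pairs each compound with the text-file row whose key column holds the same value as the compound's SD key field, regardless of row order. The Python sketch below illustrates the idea under stated assumptions: merge_by_key and the dictionary representation are hypothetical, the first row wins when a key value repeats, and compounds without a matching row are kept unchanged. None of these choices is confirmed by the description.

```python
def merge_by_key(compounds, rows, sd_key, column_key):
    """Merge text-file rows into compounds by matching key values."""
    rows_by_key = {}
    for row in rows:
        # Assumption: the first row seen for a key value is the one used.
        rows_by_key.setdefault(row[column_key], row)

    merged = []
    for fields in compounds:
        new_fields = dict(fields)
        match = rows_by_key.get(fields.get(sd_key))
        if match is not None:
            new_fields.update(match)
        # Assumption: compounds with no matching row stay unchanged.
        merged.append(new_fields)
    return merged


compounds = [{"Mol_ID": "M1"}, {"Mol_ID": "M2"}]
rows = [
    {"Mol_ID": "M2", "Formula": "C6H6"},
    {"Mol_ID": "M1", "Formula": "CH4"},
]
merged = merge_by_key(compounds, rows, sd_key="Mol_ID", column_key="Mol_ID")
```

The rows arrive in the opposite order to the compounds, yet each formula lands on the compound with the matching Mol_ID.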
-w, --workingdir dirname
Location of working directory. Default: current directory.
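To merge Sample1.csv and Sample2.csv into Sample.sdf and generate NewSample.sdf, type:
    % MergeTextFilesWithSD.pl -r NewSample Sample.sdf Sample1.csv Sample2.csv
To merge all Sample*.tsv into Sample.sdf and generate NewSample.sdf file, type:
    % MergeTextFilesWithSD.pl -r NewSample Sample.sdf Sample*.tsv
To merge column numbers "1,2" and "3,4,5" from Sample2.csv and Sample3.csv into Sample.sdf and to generate NewSample.sdf, type:
    % MergeTextFilesWithSD.pl -r NewSample -c "1,2;3,4,5" Sample.sdf Sample2.csv Sample3.csv
To merge columns "Mol_ID,Formula,MolWeight" and "Mol_ID,ChemBankID,NAME" from Sample1.csv and Sample2.csv into Sample.sdf using "Mol_ID" as SD and column keys to generate NewSample.sdf, type:
    % MergeTextFilesWithSD.pl -r NewSample -s Mol_ID -k "Mol_ID;Mol_ID" -m collabel -c "Mol_ID,Formula,MolWeight;Mol_ID,ChemBankID,NAME" Sample.sdf Sample1.csv Sample2.csv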
ExtractFromSDFiles.pl, FilterSDFiles.pl, InfoSDFiles.pl, JoinSDFiles.pl, JoinTextFiles.pl, MergeTextFiles.pl, ModifyTextFilesFormat.pl, SplitSDFiles.pl, SplitTextFiles.pl
Copyright (C) 2024 Manish Sud. All rights reserved.
This file is part of MayaChemTools.
MayaChemTools is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.