Differences
This shows you the differences between two versions of the page.
phelp:helpxcat [2021/12/29 16:21] |
phelp:helpxcat [2021/12/29 16:21] (current) |
||
---|---|---|---|
Line 1: | Line 1: | ||
+ | X-CAT -- Concatenate XML files | ||
+ | |||
+ | Because a well-formed XML file must have exactly one root element, XML files cannot be concatenated using a typical concatenation utility. | ||
+ | |||
+ | X-CAT attempts to perform an XML-aware concatenation on the selected files. | ||
+ | |||
+ | |||
+ | BASIC CONCATENATION STEPS | ||
+ | |||
+ | The basic steps for concatenating XML files are the same as for any other concatenation process. | ||
+ | |||
+ | First, choose the files that you want to concatenate, | ||
+ | |||
+ | The first time you run the program, the dialogs that select files will open in your My Documents folder. After that, however, these file selection dialogs will open in the last selected folder. | ||
+ | |||
+ | There are two ways to select files: using an explorer dialog, or setting filename patterns. | ||
+ | |||
+ | SELECT FILES WITH EXPLORER | ||
+ | |||
+ | To select files with Explorer, click on the ' | ||
+ | |||
+ | To launch an explorer dialog, click anywhere in the open space of this tab. You may then navigate to folders and select files (to select multiple files at the same time, hold down the < | ||
+ | |||
+ | Alternately, | ||
+ | |||
+ | Note that every file in this window will have a checkbox next to it. If you uncheck a file, it will be excluded from the concatenation processing. Click the ' | ||
+ | |||
+ | SELECT FILES WITH A PATTERN | ||
+ | |||
+ | To select files using a filename pattern, click on the ' | ||
+ | |||
+ | *.xml | ||
+ | |||
+ | If you do not specify a path in your pattern, the program will look for the files in the same folder as xcat.exe (which is probably not what you want). To change this behavior, enter a path below (in the box labelled 'Set a relative path' | ||
+ | |||
+ | Once the patterns have been entered, click the ' | ||
+ | |||
+ | If you want to select files from multiple folders, it may be necessary to enter a fully qualified path for each pattern. | ||
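The pattern rules above can be illustrated outside the program. The following is a minimal sketch in Python, not X-CAT's own code: the function name `expand_patterns` and the `base_dir` parameter are hypothetical, and the sketch only assumes that a pattern without a fully qualified path is resolved against some base folder.

```python
import glob
import os


def expand_patterns(patterns, base_dir=None):
    """Expand filename patterns into a list of absolute file paths.

    base_dir is a hypothetical stand-in for the folder that bare
    patterns are resolved against; it defaults to the current folder.
    """
    base_dir = base_dir or os.getcwd()
    found = []
    for pattern in patterns:
        # A fully qualified pattern is used as-is; a bare pattern such
        # as "*.xml" is joined onto base_dir first.
        if not os.path.isabs(pattern):
            pattern = os.path.join(base_dir, pattern)
        for path in sorted(glob.glob(pattern)):
            if os.path.isfile(path):
                found.append(os.path.abspath(path))
    return found
```

Passing one fully qualified pattern per folder is how this sketch would pick up files from several folders in a single call.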
+ | |||
+ | RESULTS FILE | ||
+ | |||
+ | Once the file selections have been completed, return to the main tab (' | ||
+ | |||
+ | xcat-results-YYMMDDnn.xml | ||
+ | |||
+ | where YYMMDD will be the current date, and ' | ||
+ | |||
+ | You may use the default filename, or change it to whatever you wish. | ||
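As an illustration of the default name format only, here is a small Python sketch. The function name `default_results_name` is hypothetical, and because the help text's explanation of the 'nn' part is not fully shown here, the sketch simply takes `nn` as a caller-supplied number and zero-pads it to two digits; that padding is an assumption.

```python
from datetime import date


def default_results_name(today, nn):
    """Build a name shaped like xcat-results-YYMMDDnn.xml.

    today: a datetime.date supplying the YYMMDD part.
    nn:    caller-supplied number for the last two digits (its exact
           meaning in X-CAT is an assumption left to the caller).
    """
    return "xcat-results-{}{:02d}.xml".format(today.strftime("%y%m%d"), nn)
```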
+ | |||
+ | RUN | ||
+ | |||
+ | Once the file selection is complete, and the results filename has been specified, press ' | ||
+ | |||
+ | When you press ' | ||
+ | |||
+ | RECORD COUNTS | ||
+ | |||
+ | There are no ' | ||
+ | |||
+ | DUPLICATE FILES | ||
+ | |||
+ | Duplicate files are not removed from the list, but they are left out of processing by default. For example, if you select a file more than once, it will be added to the Source Files list again, but the checkbox for the repeated entry will not be selected. If you want these files to be concatenated more than once, you will have to check them manually. | ||
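The list behaviour described above, where a repeated file is listed but left unchecked, can be sketched as follows. This is an illustrative Python model, not the program's code; the function name `add_to_source_list` and the representation of the list as (path, checked) pairs are assumptions.

```python
def add_to_source_list(source_list, path):
    """Append path to source_list as a (path, checked) pair.

    The first occurrence of a path is checked; any later occurrence is
    still listed but starts unchecked, so it is skipped unless the
    user checks it by hand.
    """
    already_listed = any(existing == path for existing, _ in source_list)
    source_list.append((path, not already_listed))
    return source_list
```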
+ | |||
+ | XML DOCUMENT PROCESSING DETAILS | ||
+ | |||
+ | The program attempts to process every specified file as an XML document. | ||
+ | |||
+ | The first task is to test-load each file. This step validates the XML structure and extracts the document element. Any file that fails this step will be removed from the list of files specified. | ||
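A test-load step of this kind can be sketched with Python's standard `xml.etree.ElementTree` module. This is not how X-CAT itself is implemented; the function name `preload_files` is hypothetical, and "validates the XML structure" is taken here to mean only that the file parses as well-formed XML.

```python
import xml.etree.ElementTree as ET


def preload_files(paths):
    """Parse each file once, keeping the ones that load cleanly.

    Returns (loaded, rejected): loaded maps each good path to the tag
    of its document element; rejected lists the paths that failed.
    """
    loaded = {}
    rejected = []
    for path in paths:
        try:
            root = ET.parse(path).getroot()
        except (ET.ParseError, OSError):
            # Not well-formed, or not readable: drop it from the run.
            rejected.append(path)
            continue
        loaded[path] = root.tag
    return loaded, rejected
```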
+ | |||
+ | The second task is to check the document elements extracted by the first step. For best results, they should all be the same. If they are not, the program will display a warning, listing the variant document elements that it found. If you override this warning, the program will continue to concatenate the files, even though it is possible the results will be corrupt. | ||
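The consistency check on document elements can be sketched in a few lines of Python. The function name `variant_document_elements` is hypothetical; the sketch assumes the tags collected during the test-load step are passed in as a plain list of strings.

```python
from collections import Counter


def variant_document_elements(root_tags):
    """Return the distinct document-element tags if they disagree.

    An empty list means every file shares one document element (or no
    files were given); otherwise the sorted list of variants is
    returned so it can be shown in a warning.
    """
    counts = Counter(root_tags)
    if len(counts) <= 1:
        return []
    return sorted(counts)
```

A caller could show the returned list in a warning and continue only if the user chooses to override it.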
+ | |||
+ | Once this pre-processing is taken care of, the program begins to concatenate the files. | ||
+ | |||
+ | In general, for each file, the program collects all children of the document element (or root node). | ||
+ | |||
+ | For example, in a MARCXML or MODS file, the root node might be '< | ||
+ | |||
+ | Or, in an OAI file, the root node might be '< | ||
+ | |||
+ | The first file in the file list is treated slightly differently than the rest. The first file is itself loaded into the resulting XML document; from there the program makes a list of its top-level elements. Then, each subsequent file is loaded into a scratch XML document, and the program searches each file for top-level elements that match the list created from the first file. Each matching element is then appended to the resulting XML document. | ||
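The first-file logic described above can be sketched with Python's standard `xml.etree.ElementTree` module. This is a simplified model, not the program's own code: the function name `concatenate` is hypothetical, the inputs are XML strings rather than files, and "top-level elements" is taken to mean the direct children of the document element.

```python
import xml.etree.ElementTree as ET


def concatenate(first_xml, other_xmls):
    """Merge XML documents using the first document as the template.

    The first document becomes the result. Each later document is
    parsed into a scratch tree, and any direct child whose tag also
    appears among the first document's direct children is appended to
    the result. Other children are ignored.
    """
    result = ET.fromstring(first_xml)
    wanted = {child.tag for child in result}
    for text in other_xmls:
        scratch = ET.fromstring(text)
        for child in scratch:
            if child.tag in wanted:
                result.append(child)
    return result
```

In this sketch the document element of each later file is never compared with the first, which is why a separate check of the document elements beforehand is useful.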
+ | |||
+ | In the event that different types of documents are being concatenated, | ||
+ | |||
+ | There' | ||
+ | |||
+ | File 1: | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | |||
+ | File 2: | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | |||
+ | and so on. | ||
+ | |||
+ | Using the logic described above, the resulting XML document will look like this: | ||
+ | |||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | ... | ||
+ | |||
+ | </ | ||
+ | | ||
+ | There' | ||
+ | |||
+ | However, it is possible to eliminate the extra < | ||
+ | |||
+ | This 'XML Options' | ||
+ | |||
+ | |||
+ | |||
+ | |||