Millimeter-wave gas spectroscopy for breath analysis of COPD patients in comparison to GC-MS

The analysis of human breath is a very active area of research, driven by the vision of a fast, easy, and non-invasive tool for medical diagnoses at the point of care. Millimeter-wave gas spectroscopy (MMWGS) is a novel, well-suited technique for this application as it provides high sensitivity, specificity and selectivity. Most of all, it offers the perspective of compact low-cost systems to be used in doctors’ offices or hospitals. In this work, we demonstrate the analysis of breath samples acquired in a medical environment using MMWGS and evaluate validity, reliability, as well as limitations and perspectives of the method. To this end, we investigated 28 duplicate samples from chronic obstructive lung disease patients and compared the results to gas chromatography-mass spectrometry (GC-MS). The quantification of the data was conducted using a calibration-free fit model, which describes the data precisely and delivers absolute quantities. For ethanol, acetone, and acetonitrile, the results agree well with the GC-MS measurements and are as reliable as GC-MS. The duplicate samples deviate from the mean values by only 6% to 18%. Detection limits of MMWGS depend strongly on the molecular species. For example, acetonitrile can be traced down to 1.8 × 10⁻¹² mol by the MMWGS system, which is comparable to the GC-MS system. We observed correlations of abundances between formaldehyde and acetaldehyde as well as between acetonitrile and acetaldehyde, which demonstrates the potential of MMWGS for breath research.


Introduction
The analysis of human breath is a promising tool for medical diagnoses at the point of care because it is non-invasive and convenient for the patient [1][2][3]. Among the many diseases being studied are different forms of cancer [4][5][6], viral infections such as COVID-19 [7,8], and lung-related diseases [9] such as the chronic obstructive lung disease (COPD) [10][11][12]. The physiological information is based on a small fraction of the breath containing endogenous volatile organic compounds (VOCs). Until now, hundreds of different VOCs have been detected in human breath [13][14][15]. One common example is acetone, which alone can be linked to a large variety of conditions and diseases such as diabetes [16].
The most widely applied method for the analysis of breath samples is gas chromatography-mass spectrometry (GC-MS) [17][18][19]. Due to its high sensitivity, it is often referred to as the gold standard. Hence, it is well suited to be used for research and to serve as a reference for novel methods. However, it is not feasible to use GC-MS systems widely in doctors' offices or hospitals because they are usually bulky and expensive as well as time-consuming and complicated to operate. Other common methods include, for instance, ion mobility spectrometry or electrochemical sensors.
Millimeter-wave gas spectroscopy (MMWGS) is a novel approach for breath gas sensing with the potential to contribute new insights to the field. A basic MMWGS setup consists of a transmitter, a gas absorption cell, and a receiver. At certain frequencies, the transmitted radiation excites rotational transitions of the molecules in the gas cell, which is evidenced as an absorption detected by the receiver. Since the basic principle is fundamentally different from most established methods for breath analysis, it comes with certain advantages and complementary characteristics. First of all, the method is considered absolutely specific with regard to the identification of the molecules, because each molecule provides a characteristic fingerprint of transition frequencies [20]. Even isomers and isotopologues of the molecules can be distinguished unambiguously [21]. Secondly, a large variety of molecules can be investigated. Typically, the absorption lines are as narrow as 1 MHz and typical system bandwidths cover around 100 GHz, which allows for the detection of a large variety of different absorption lines from different species without spectral overlap. The method is limited to polar molecules, but it covers a large range of molecular masses. In detail, the sensitivity depends on the particular molecule and the available frequency range. For instance, our system is able to detect water, hydrogen cyanide, formaldehyde, methanol, acetonitrile, acetaldehyde, carbonyl sulfide, or sulfur dioxide with high sensitivity. These molecules cover molar masses from 18.04 g mol⁻¹ to 64.07 g mol⁻¹, but the method is not limited to this range. With these characteristics, the method can provide valuable complementary information to breath research as molecules might be detected more specifically, more sensitively or exclusively. In addition, the method can provide absolute quantities without the need for calibration.
Another key advantage of the method is the simplicity of the systems, which can result in compact low-cost systems. Table-top MMWGS systems have been demonstrated and can be further reduced in size by new concepts for the gas absorption cells, which usually dominate the dimensions [22][23][24]. Potential costs for commercial systems will likely be determined mainly by the transmitter and the receiver. With the advance of silicon-germanium bipolar complementary metal-oxide-semiconductor (SiGe BiCMOS) and complementary metal-oxide-semiconductor (CMOS) technology, highly integrated low-cost systems are in sight [25,26].
Due to these developments and the favorable characteristics, there is a potential for small, easy-to-use table-top MMWGS systems for breath analysis in the near future. It is difficult to foresee future system properties, but it seems realistic that a commercial MMWGS system could fit in a shoe box at the cost of a few thousand euros. For targeted detections, short measurement times can be realized because the scan of an absorption line typically takes only a few seconds. Due to the low working pressure in the gas absorption cell, it is possible to effectively integrate the sampling into the system. Therefore, the complete process including sampling, measurement, and analysis can be realized on-site. This is also a key feature of compact nitric oxide sensors, which led to the success of analyzing fractional exhaled nitric oxide in human breath [27][28][29]; however, these sensors are limited to NO detection only. Another type of compact sensor is the electronic nose, but these devices suffer from a lack of specificity, i.e. they provide a breath pattern and cannot identify VOCs [30,31], making the detection of confounding factors very difficult. MMWGS can detect a large variety of molecules specifically, which could also be an advantage in occupational health settings such as monitoring people's exposure to VOCs (e.g. formaldehyde or acetonitrile) in a certain working environment [32] or the detection of drug abuse [33]. MMWGS is also capable of providing new information to basic breath research because of the complementary information. Due to the easy operation, it can drive more research, which in turn increases the potential for new knowledge. Most of all, the properties of MMWGS are promising for breath analyzers that could be used in doctors' offices or hospitals.
The general capability of MMWGS for breath gas analysis has already been demonstrated in several studies [34][35][36][37][38]. However, not much research has been done on the validity and reliability of MMWGS for breath analysis in comparison to established methods. This is very important for new methods because, compared to reference measurements in the laboratory, many additional challenges arise. Factors that influence the measurement outcomes include the patient preparation, the sampled breath portion, the sampling method and environment, sample storage, involved materials, and data processing, to name a few [39][40][41][42]. The commonly used technique of wavelength modulation spectroscopy (WMS) for MMWGS poses an additional challenge for the quantification, since the resulting waveforms are difficult to interpret [43].
In this paper, we aim to validate the results of breath analysis through MMWGS and demonstrate the feasibility of large-scale studies. For this purpose, we investigated 28 duplicate breath samples obtained from 19 patients in a medical environment and compared them to GC-MS measurements. Sampling and measurements followed a strict protocol based on commonly used equipment and parameters. In order to compare quantities, we developed a method for calibration-free quantification of the WMS signals. By the analysis of duplicate samples, the reliability of the results was also evaluated. The paper concludes with a summary including a critical discussion of advantages and disadvantages of MMWGS for the analysis of exhaled human breath.

Sampling
The samples for this study were taken from a randomly selected subgroup within a large-scale study on exacerbation of COPD patients, the PACE study [44]. In a proof-of-principle study, it was shown that exacerbation is recognizable by exhaled VOC patterns [45]. Running our investigation in parallel with this study provides two major advantages: first, we can prove the feasibility of the MMWGS method in a medical setting by application to patients with a lung-related condition. Second, we can benefit from the study infrastructure, i.e. patient acquisition, sampling, reference measurements and reconditioning of the tubes. This allows for a thorough comparison of the MMWGS method with the well-established GC-MS.
Acquisition and sampling of the patients were carried out in the Schön Klinik Berchtesgadener Land in Schönau am Königssee, Germany. We consecutively included patients with COPD (GOLD stage II-IV) referred to an inpatient pulmonary rehabilitation program at the Schön Klinik Berchtesgadener Land. Patients with asthma or asthma-COPD overlap syndrome were excluded from the trial. All subjects gave their written informed consent before they were included in the study. The study has been approved by the ethics committee of the Philipps University of Marburg (No. 61/19) and has been conducted in accordance with the current version of the Declaration of Helsinki (international ICH-GCP E6 guidelines). For this study, we considered samples from a group of 19 different patients. An overview of the patient characteristics is shown in table 1. Each time a patient was sampled, four samples were taken. Seven patients were sampled twice with two to seven days between the two sampling dates and one patient was sampled three times with two and seven days in between. In total, 28 sets of samples were taken. We did not select specific COPD patients and also did not include any control subjects, as we did not intend to relate the VOC data to any clinical parameter at this stage. The number of sample sets was chosen in order to ensure a sufficient range of molecular abundances and to enable comparison of both techniques. For most investigated species, the measured abundances cover roughly one order of magnitude, which is well suited for a comparison between the methods. The samples were taken with the ReCIVA sampler from Owlstone Medical Ltd according to the respective protocol of the study [44,46]. It is a handheld device with integrated pumps, pressure sensor and carbon dioxide sensor to allow for a controlled sampling of VOCs from exhaled breath. Due to the sampler's capability of acquiring defined volumes and breath portions, it is well suited for standardized sampling procedures.
In this study, the patients inhaled precleaned pressurized room air provided by a SICOLAB

MMWGS sensor system
The MMWGS breath sensor system is based on an absorption spectroscopy setup as described in more detail previously [37]. A schematic of the instrument is shown in figure 1 with the main components being a transmitter, an absorption cell (containing the released VOCs from the sample), and a receiver. The transmitter and receiver are from Virginia Diodes Inc. (VDI). The transmitter generates millimeter-wave radiation which is tunable in a frequency range from 220 GHz to 330 GHz with an output power of up to 2 mW. The emitted beam is guided through a circular multipass absorption cell which provides an absorption length of 1.9 m while having a diameter of only 21.5 cm and a height of 8 cm [24]. The transmitted millimeter-wave radiation is detected by a heterodyne receiver which covers the same spectral range as the transmitter. Owing to the compact multipass absorption cell, the whole setup fits in a 19-inch rack [23]. The Tenax tube with the gas sample is connected to the absorption cell via a valve. A compact, oil-free turbomolecular pump is used for evacuating the absorption cell. In order to measure an absorption spectrum, the frequency of the transmitter is scanned across an absorption line and the transmitted power is detected by the receiver as a function of the transmitter frequency. In order to increase the sensitivity of the instrument and to minimize baseline effects, the method of WMS was applied [48]. In WMS, the frequency of the transmitter is modulated at a fixed frequency and amplitude while scanning across the absorption line. The transmitted signal is detected at the second harmonic of the modulation frequency, which results in second-derivative-like signal shapes. A photograph of the setup is shown in figure 2.
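The second-harmonic detection described above can be illustrated numerically. The following is a simplified sketch and not the signal processing of the instrument: it assumes a Gaussian absorption line, a weak peak absorbance, and a sinusoidal frequency modulation, and it extracts the 2f component by projecting the transmitted signal onto cos(2φ) over one modulation period. All function names and numerical values (half width, modulation depth, peak absorbance) are hypothetical; only the 10 MHz scan width is taken from the measurement procedure.

```python
import math

def transmission(offset_mhz, hwhm_mhz, peak_absorbance):
    """Beer-Lambert transmission of a Gaussian line (assumed line shape)."""
    sigma = hwhm_mhz / math.sqrt(2.0 * math.log(2.0))
    return math.exp(-peak_absorbance * math.exp(-offset_mhz ** 2 / (2.0 * sigma ** 2)))

def second_harmonic(center_offset_mhz, hwhm_mhz, peak_absorbance,
                    mod_depth_mhz, n_steps=256):
    """2f Fourier coefficient of the transmission over one modulation period."""
    acc = 0.0
    for k in range(n_steps):
        phase = 2.0 * math.pi * k / n_steps
        # Instantaneous frequency offset from the line center
        offset = center_offset_mhz + mod_depth_mhz * math.cos(phase)
        acc += transmission(offset, hwhm_mhz, peak_absorbance) * math.cos(2.0 * phase)
    return 2.0 * acc / n_steps

def scan_2f(span_mhz=10.0, n_points=201, hwhm_mhz=0.5,
            peak_absorbance=0.01, mod_depth_mhz=0.5):
    """Step the carrier across the line and record the 2f signal at each point."""
    step = span_mhz / (n_points - 1)
    offsets = [(i - (n_points - 1) / 2) * step for i in range(n_points)]
    signal = [second_harmonic(d, hwhm_mhz, peak_absorbance, mod_depth_mhz)
              for d in offsets]
    return offsets, signal

if __name__ == "__main__":
    offsets, signal = scan_2f()
    center = len(signal) // 2
    print(f"2f signal at line center: {signal[center]:.3e}")
    print(f"most negative side lobe:  {min(signal):.3e}")
```

With this sign convention the trace shows a central lobe flanked by two weaker lobes of opposite sign, which is the second-derivative-like shape mentioned in the text.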
For the measurements described in a later section of this paper, the transmitter and receiver modules from VDI were replaced by devices based on the newly developed SiGe BiCMOS technology [25]. The SiGe BiCMOS transmitter and receiver modules have the advantage of being even more compact than the VDI modules while providing similar output power and sensitivity. However, their primary advantage lies in the possibility of better hardware integration and cost-efficient production on a large scale [25].
For this study, we implemented a few modifications to improve the system performance for breath analysis compared to the system reported earlier [37]. First, we used a newly manufactured replica of the gas cell that was thoroughly cleaned and evacuated and not used for any other measurements beforehand. By avoiding unintended contamination and minimizing outgassing of water from the walls of the cell, we prevent broadening of the absorption lines. Second, we minimized the use of plastics in the system to reduce additional contamination due to outgassing induced by the heating process. Finally, we introduced a readout of the direct absorption signal along with the phase-sensitive second harmonic detection to determine the baseline voltage. This improvement is important with respect to the quantification of the results.

Measurement procedure
The measurements with the MMWGS setup were performed according to a strict protocol. First, the gas cell was evacuated down to a pressure of less than 0.01 Pa to avoid cross contamination. Then, the Tenax tube was attached to the cell such that the gas flow into the cell was opposed to the gas flow in the sampling process. Prior to the absorption measurement the tube was heated for thermal desorption. The heating temperature was precisely controlled and set to 250 °C, just as in the GC-MS measurements, because it strongly affects the desorption characteristics. The release of VOCs and water into the gas cell typically increases the total pressure to around 1-2 Pa. Then, the molecules were detected by scanning up to three absorption lines per molecule in the available spectral range by WMS. After that, the cooled Tenax tube was detached. The measurement cycle for one tube (attachment, evacuation, heating, cooling, detachment) took approximately 15 min excluding the measurements, which take 5 s per absorption line (50 ms integration time, 10 MHz scan width). The cycle time could be strongly reduced by simple modifications of the setup (such as click mounts or cooling blocks). Along with the measurements of the 56 breath samples, reference measurements have been made following the same procedure, in order to identify any background signal which may contribute to the measured gas mixture. Three reference measurements were performed: without any tube connected to the evacuated gas cell (A), with the evacuated gas cell and an attached stainless-steel tube without Tenax heated to 250 °C (B), and with an evacuated gas cell and an attached empty Tenax tube heated to 250 °C (C). The references were measured in exact accordance with the protocol for the breath samples.

Data analysis
The identification of the detected species forms the first step of our data analysis. A molecule is considered identified if all scanned lines of the molecule (three lines in most cases; one in cases with only one available line) have a signal-to-noise ratio (SNR) greater than one. The strongest line (in terms of the signal amplitude) of each molecule is then selected to quantify the abundance. This is done by analysis of the second harmonic (2f) content of the WMS signal, which resembles the second derivative of an absorption line. However, to interpret the WMS signal in detail, many variables have to be taken into account [43]. Based on the WMS theory [48], we have developed and validated a numerical fit model that allows for absolute and calibration-free quantification of the measurements [49]. A flow diagram of the model is shown in figure 3. Input parameters are the absorption line center frequency (ν₀), the molar mass (M), the temperature (T), and the frequency modulation depth (Δν_mod). Given the strong relation between Δν_mod and the shape of the measured signal, Δν_mod was measured precisely beforehand. From the fit result, we obtain the transmission profile, the absorption profile, the partial pressure, and finally the amount of substance (n), i.e. the number of molecules. The normalization is based on the following input parameters: the baseline voltage as determined from the direct absorption signal (U₀), the absorption path length (L), the absorption line intensity (S), and the gas cell volume (V_Cell).
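Two textbook relations connect to the quantities listed above: the center frequency, molar mass, and temperature determine the Doppler width of a line, and a partial pressure in a known cell volume converts to an amount of substance through the ideal gas law. The sketch below illustrates only these two relations under the assumptions of pure Doppler broadening and ideal-gas behaviour; it is not the fit model of [49]. The function names, the 300 GHz line frequency, the partial pressure, and the cell volume in the example are hypothetical placeholders.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant in J/K
N_A = 6.02214076e23   # Avogadro constant in 1/mol
C = 299792458.0       # speed of light in m/s
R = K_B * N_A         # molar gas constant in J/(mol K)

def doppler_fwhm(nu0_hz, molar_mass_g_mol, temperature_k):
    """Doppler full width at half maximum (Hz) of a line at nu0_hz."""
    mass_kg = molar_mass_g_mol * 1e-3 / N_A
    return nu0_hz * math.sqrt(
        8.0 * K_B * temperature_k * math.log(2.0) / (mass_kg * C ** 2)
    )

def amount_of_substance(partial_pressure_pa, volume_m3, temperature_k):
    """Ideal gas law: n = p V / (R T), returned in mol."""
    return partial_pressure_pa * volume_m3 / (R * temperature_k)

if __name__ == "__main__":
    # Hypothetical example: a 300 GHz line of a molecule with M = 41.05 g/mol
    fwhm = doppler_fwhm(300e9, 41.05, 296.0)
    print(f"Doppler FWHM: {fwhm / 1e3:.0f} kHz")
    # Hypothetical partial pressure (1 mPa) and cell volume (1 litre)
    n = amount_of_substance(1e-3, 1e-3, 296.0)
    print(f"amount of substance: {n:.2e} mol")
```

The resulting Doppler width of several hundred kilohertz is in line with the statement that absorption lines at these frequencies and pressures are about 1 MHz wide.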

GC-MS reference system
Reference measurements were performed at Fraunhofer ITEM in Hannover using a Thermal Desorption (TD) GC-MS system from Perkin Elmer. The sample contents were caught in a cold trap and then desorbed at 250 °C by the TD unit (Turbomatrix ATD350). In the GC unit (Clarus 680), the sample was purged through the column by helium with a flow rate of 10 ml min⁻¹. Finally, detection was realized by the mass spectrometer system Clarus SQ 8 T. In this study, we analyzed the peak area of the total ion counts of selected masses at the respective retention time, which is not a calibrated measure in terms of absolute quantities. Note that the mass spectrometer can only detect ions with a mass-to-charge ratio (m/z) above 35.

Detected molecules
In order to establish MMWGS as a method for breath analysis, we will compare our results with those obtained by GC-MS for molecules which are detectable by both techniques. In total, 12 molecules have been detected by MMWGS whereas only five of them were detected by the GC-MS system. The molecular species detected by our system are shown in table 2, most of them having been studied earlier with respect to breath analysis [16,32,50-55]. For nine species, three absorption lines were detected and for the other three, only one strong line was available in the frequency range. The sparse spectra in the latter cases are typical only for very simple molecules such as water, hydrogen cyanide, and carbon monoxide. Four molecules, namely carbon monoxide, methanol, acetic acid, and sulfur dioxide, were only detected in a few samples and with an SNR less than five. The five lightest molecules were not detected by GC-MS, because the mass spectrometer of the reference system is not able to detect molecules with a molar mass smaller than 35 g mol⁻¹. Carbonyl sulfide and sulfur dioxide were not analyzed by the GC-MS reference system. Acetic acid was excluded from the comparison with GC-MS because of the small SNR in the MMWGS measurements. This leaves four species, namely acetonitrile, acetaldehyde, ethanol, and acetone, for a comparison between both methods in this study.
A selection of measurements and fits for the molecules listed in table 2 is shown in figure 4. For each molecule, the strongest line (which is used for quantification) is shown from an exemplary sample with a relatively high abundance. As can be seen, the fitting model works well as it represents the data correctly within the noise of the measurement, despite the different linewidths and shapes. Note that the raw signals shown here do not correspond to the molecules' abundances, because for a quantitative analysis, the line strength and other molecule-specific parameters have to be taken into account. Figure 5 presents the abundances of acetonitrile, acetaldehyde, ethanol, and acetone in the samples as measured by MMWGS (blue) and GC-MS (red). The columns represent the mean amounts of substance from each sample pair. The upper and the lower end of the error bars represent the amounts of the two samples of the sample pairs. On the right-hand side, the correlations between the amounts of substance as determined by both methods are shown. The straight grey line in each panel represents a linear fit based on least squares. For ethanol and acetone, the correlations between MMWGS and GC-MS are excellent with coefficients of R = 0.93 and R = 0.83, respectively. Note that the acetone abundance of one sample of the sample pair 33/34 seems to be an irregular outlier in the GC-MS measurement whereas in MMWGS the data are almost identical. The correlation for acetonitrile is less pronounced. However, it is still good (R = 0.58) considering the very low concentrations (roughly 100 times smaller than ethanol or acetone), which make it more challenging to quantify it precisely. These results prove the validity of MMWGS for these compounds very well.
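The correlation coefficients and the least-squares lines mentioned above follow from standard formulas. The sketch below shows one way to compute a Pearson coefficient R and a straight-line fit for paired measurements; the function names are illustrative and the numbers in the example are made up, not data from this study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def least_squares_line(x, y):
    """Slope and intercept of the ordinary least-squares line y = slope*x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

if __name__ == "__main__":
    # Made-up paired values: amounts from one method vs. peak areas from another
    amounts = [1.0, 2.1, 2.9, 4.2, 5.0]
    peak_areas = [11.0, 19.5, 32.0, 39.0, 52.5]
    print("R =", round(pearson_r(amounts, peak_areas), 3))
    print("slope, intercept =", least_squares_line(amounts, peak_areas))
```

Because R is invariant under linear rescaling of either axis, the comparison does not require the GC-MS peak areas to be calibrated to absolute quantities.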

Quantitative results
For acetaldehyde, there is no correlation (R = −0.07) between both methods. To explain the mismatch, it is instructive to consider the variations within the sample pairs, which are very similar for both methods (as discussed below, cf figure 7(b)). Along with the very low concentrations, this might indicate that the variations are intrinsic to the samples and originate (at least partially) from other steps of the breath analysis pipeline, such as sampling, sample storage, or sample handling. Aspects to be considered are e.g. molecular contents in the environment or thermal adsorption characteristics. Acetaldehyde is known to have a very high presence in indoor air (even higher than in breath samples) [56]. In addition, its breakthrough volume at room temperature in Tenax is quite small (0.65 l g⁻¹), which can cause it to be flushed out easily. In conclusion, the quantitative results from MMWGS are very consistent for ethanol, acetone, and acetonitrile. In the case of acetaldehyde, the results from both methods are different, but might be determined partially by intrinsic sample variations rather than by the method.
To summarize these results, a comparison between MMWGS and GC-MS is given by Bland-Altman plots in figure 6. Since the GC-MS data were not calibrated to absolute quantities, the values obtained with both methods cannot be compared directly. This required normalizing the data of both methods to the respective means such that the average difference is zero. The values of the normalized differences at 1.96·σ (cf orange lines) are 1.47 (acetonitrile), 1.18 (acetaldehyde), 0.55 (ethanol), and 0.64 (acetone). The good agreement between both methods for ethanol and acetone is clearly recognizable in the plots.
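The normalization and the 1.96·σ limits of such a Bland-Altman comparison can be sketched as follows. This is a minimal illustration, assuming that σ is the sample standard deviation of the normalized differences (the text does not state which estimator was used); the function name and the example values are hypothetical.

```python
import statistics

def bland_altman(method_a, method_b):
    """Normalize both data sets to a mean of one, then return the per-sample
    means, the differences, the average difference, and the 1.96*sigma limit."""
    mean_a = statistics.fmean(method_a)
    mean_b = statistics.fmean(method_b)
    norm_a = [v / mean_a for v in method_a]
    norm_b = [v / mean_b for v in method_b]
    means = [(a + b) / 2 for a, b in zip(norm_a, norm_b)]
    diffs = [a - b for a, b in zip(norm_a, norm_b)]
    bias = statistics.fmean(diffs)           # zero by construction
    limit = 1.96 * statistics.stdev(diffs)   # assumption: sample standard deviation
    return means, diffs, bias, limit

if __name__ == "__main__":
    # Made-up values in different units (e.g. mol vs. uncalibrated peak area)
    a = [1.0, 2.0, 3.0, 6.0]
    b = [100.0, 250.0, 280.0, 570.0]
    means, diffs, bias, limit = bland_altman(a, b)
    print(f"bias = {bias:.2e}, limits of agreement = +/-{limit:.2f}")
```

Since both normalized data sets have a mean of one, the average difference vanishes and only the spread of the differences carries information.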
Closely related to the validity of the results are the background concentrations of the molecules in the system, because they can cause misleading results. These background signals have been measured carefully for three different configurations (the gas cell, the heated gas cell, and an empty heated Tenax tube). All four molecules were either not detected in any of the reference measurements or in the case of acetaldehyde only slightly above the noise limit (with an SNR of 1.5) in the empty Tenax tube. On average, the acetaldehyde abundance in the breath samples was roughly seven times stronger than that.
In order to compare the abundances of the four molecules relative to each other, we calculated the mean values of all samples and normalized them to the sum of the four abundances, as shown in figure 7(a). Overall, the results of both methods are similar. However, note that it was not expected that the results agree perfectly, because the GC-MS reference measurements were not calibrated. For the peak areas, only the major but not all masses were used; therefore, no absolute level can be derived from these data. This is the main explanation for the discrepancy between these results. An additional reason for the discrepancies is the thermal desorption, which is carried out differently in both setups, i.e. no carrier gas and no cold trap were used in our MMWGS setup. In turn, the desorption directly affects the abundance of the molecules in the measurements. In the MMWGS measurements, the main reason for inaccuracies of the determined quantities might be nonlinearities in the detector response or power saturation of the molecular transition. To give an estimate, we quantified all three absorption lines for each molecule in one exemplary sample. These quantities vary by up to ±20%, which is minor compared to the discrepancy to GC-MS. Note that many parameters affect the MMWGS calibration process, e.g. the line intensity, which can vary by several orders of magnitude between molecules. Considering all the aforementioned aspects and the fundamentally different operation principles of both methods, the agreement is good. It shows that the overall relation between the molecular abundances is in the correct order (e.g. the acetonitrile abundance is much smaller than the others).
Figure 6. Bland-Altman plots for the comparison between MMWGS and GC-MS. Both data sets were normalized to a mean value of one for this representation. The x- and y-axes describe the mean and the difference between both methods, respectively. The orange lines are drawn at 1.96·σ.
Due to the lack of calibration in the GC-MS measurements, a more precise conclusion cannot be drawn from these results. Generally, the results from MMWGS are likely favorable in this regard as they deliver absolute quantities by principle, which is an advantage over other methods. It should be pointed out that most studies on exhaled breath are based on relative abundances between patients rather than absolute quantities. However, an accurate quantification can help to compare different studies. In addition, the knowledge of exact quantities can support understanding the underlying physiological processes [57].

Reliability
For the purpose of medical studies, it is very important that the method of analysis is reliable, i.e. two measurements of the same sample should deliver the same result. To evaluate the reliability of both methods separately, we investigated how strongly each pair of measurements of the duplicate samples varies.
We determined the difference of each sample to the respective mean for the sample pair and averaged it over all samples. This value was normalized through division by the mean abundance of all samples and is shown in figure 7(b). It turns out that these numbers are very similar for both techniques or even slightly better for MMWGS. The values range between 6% and 18% for MMWGS and between 7% and 19% for GC-MS. In addition, we determined the correlation between sample 1 and 2 of each sample set. We can conclude that the results from MMWGS are very reliable and at least as reliable as the results obtained with the GC-MS reference system. This is promising for medical studies, in particular when considering common variations in patients with similar physiological conditions and the statistics from a large number of patients.
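The duplicate-sample metric described above can be written down compactly. The sketch below assumes that the absolute value of each deviation is averaged (signed deviations within a pair would cancel exactly); the function name and the example values are hypothetical.

```python
import statistics

def relative_duplicate_deviation(pairs):
    """Mean absolute deviation of each sample from its pair mean,
    divided by the mean abundance of all samples."""
    deviations = []
    values = []
    for first, second in pairs:
        pair_mean = (first + second) / 2
        # Assumption: absolute deviations, since signed ones sum to zero per pair
        deviations += [abs(first - pair_mean), abs(second - pair_mean)]
        values += [first, second]
    return statistics.fmean(deviations) / statistics.fmean(values)

if __name__ == "__main__":
    # Made-up duplicate measurements (arbitrary units)
    pairs = [(9.0, 11.0), (18.0, 22.0)]
    print(f"relative deviation: {relative_duplicate_deviation(pairs):.0%}")  # prints 10%
```

A value of 0.10 for this metric corresponds to duplicates that deviate from their pair mean by 10% of the overall mean abundance on average.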

Limit of detection (LOD)
The LOD is given by the noise of the system, which is 100 nV root mean square (RMS) on average. From the SNR of the measurements (cf figure 4) and the noise limit, we can extrapolate the LOD. As a threshold, we considered SNR = 1, which is still recognizable because the lines span several spectral bins. These limits are shown in figure 8 for the different species. They vary by three orders of magnitude (note the logarithmic scale), mainly determined by the line intensity, but also affected by the baseline signal at the center frequency and the linewidth. In some cases, absorption lines can be weak and the LOD is higher than in GC-MS. On the other hand, a number of molecules in this study provide very strong transitions, which result in detection limits as low as in GC-MS, for example for acetonitrile. When considering the corresponding mass detection limits, we find that a mass as low as 10 pg of hydrogen cyanide can be detected by our system. For some molecules, we have observed absorption lines also in the reference measurements, i.e. the molecules are present in the system without any sample attached. The background levels (the highest from A and B) are indicated by light purple columns and represent practical detection limits of the current system. As expected, the background signal of water, a common contaminant in vacuum systems, is particularly high. It should be noted that the background signals can be further reduced by an improved vacuum system. The contributions from the Tenax tubes to the background were measured in C. These were observed for water (5.8 × 10⁻⁷ mol), formaldehyde (7.5 × 10⁻¹¹ mol), and acetaldehyde (4.8 × 10⁻¹¹ mol). Here, to allow for a comparison with GC-MS, we consider a threshold of 10 000 units of the peak area as LOD for the reference GC-MS system, according to practical experience. This corresponds to an LOD of around 10⁻¹² mol (cf red bars in figure 8). Note that some molecules (e.g. water, hydrogen cyanide, formaldehyde) were not analyzed by the reference GC-MS system.
Figure 8. Limit of detection (dark purple) and limitations due to background signals (light purple) of the MMWGS system. The noise limits differ by orders of magnitude (note the logarithmic scale), mainly due to the different absorption line intensities. The GC-MS detection limit is indicated by the red bars for those molecules which are detectable by the GC-MS reference system.
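The extrapolation of a detection limit from a measured SNR, and the conversion between a mass and an amount of substance, can be sketched as follows. The sketch assumes that the signal amplitude scales linearly with the amount of substance, which is a simplification that holds for weak absorption; the function names and the measured amount and SNR in the example are hypothetical. The molar mass of hydrogen cyanide (27.03 g/mol) is a standard value.

```python
def lod_amount(measured_amount_mol, measured_snr, threshold_snr=1.0):
    """Amount of substance at which the SNR would drop to threshold_snr,
    assuming the signal is proportional to the amount."""
    return measured_amount_mol * threshold_snr / measured_snr

def mass_to_amount(mass_g, molar_mass_g_mol):
    """Convert a mass in grams to an amount of substance in mol."""
    return mass_g / molar_mass_g_mol

if __name__ == "__main__":
    # Hypothetical measurement: 2.0e-10 mol detected with an SNR of 50
    print(f"extrapolated LOD: {lod_amount(2.0e-10, 50.0):.1e} mol")
    # 10 pg of hydrogen cyanide expressed as an amount of substance
    print(f"10 pg HCN = {mass_to_amount(10e-12, 27.03):.2e} mol")
```

For hydrogen cyanide, the mass of 10 pg quoted in the text corresponds to roughly 3.7 × 10⁻¹³ mol.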

Breath patterns of the patients
After demonstrating validity and reliability of MMWGS for breath analysis, we now turn to the breath patterns of the investigated 19 patients. First, we compare the abundances of the eight molecules which were detected clearly in all patients by MMWGS (cf table 2). The mean amounts of substance for all patients are shown in figure 9 (purple columns).
The first thing to note is the high abundance of water. From the calculated partial pressures and the measured total pressures in our gas cell, we can derive that the measured gas mixtures consist of 60%-70% water. Despite the significant contribution of water from background in the system (the abundance of water in B is 3.4 × 10⁻⁷ mol), the measurements provide relevant information about the breath samples' water contents on top (the abundances in the samples range from 3.4 × 10⁻⁷ mol to 1.2 × 10⁻⁶ mol). The correlation between the duplicate samples is R = 0.76. Monitoring the water content of the samples might be useful for normalization strategies [50]. Following water, acetone and ethanol have the highest abundances with mean values of 2.6 × 10⁻⁹ mol and 1.8 × 10⁻⁹ mol, respectively. This is only about 0.3%-0.4% of the amount of water (note the logarithmic scale). The other compounds were measured in even lower concentrations, lower by up to two more orders of magnitude.
The variations of the abundances between patients are illustrated by the white dots in figure 9.
One can see that the variations of the acetonitrile and ethanol abundances are quite large (coefficient of variation CV = 0.88 and CV = 0.73, respectively), whereas the abundances of formaldehyde (CV = 0.18) and carbonyl sulfide (CV = 0.19) are very similar for all patients.
To study possible correlations between the molecular abundances, we determined the correlation coefficients for each pair of molecules (see table 3). We have found good correlations between formaldehyde and acetaldehyde (R = 0.61) as well as between acetonitrile and acetaldehyde (R = 0.66), see figure 10. These molecules have been shown to be related to active or passive smoking [58][59][60][61]. All investigated patients are former smokers, except for two patients who are still actively smoking (cf table 1). For all other pairs of molecules, the absolute value of the correlation coefficient is at most 0.42.
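The two statistics used above can be sketched as follows. The sketch assumes that CV is the sample standard deviation divided by the mean and that R is the Pearson correlation coefficient; the text does not state which estimators were used, so both are assumptions. All function names and the example numbers are hypothetical and do not represent patient data.

```python
import math
from itertools import combinations
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed definition of CV)."""
    return stdev(values) / mean(values)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pairwise_correlations(abundances):
    """Map each pair of molecule names to the correlation of their abundances."""
    return {
        (m1, m2): pearson_r(abundances[m1], abundances[m2])
        for m1, m2 in combinations(abundances, 2)
    }


if __name__ == "__main__":
    # Hypothetical per-patient abundances in mol (illustration only).
    data = {
        "formaldehyde": [1.0e-10, 1.2e-10, 0.9e-10, 1.1e-10],
        "acetaldehyde": [4.0e-11, 5.5e-11, 3.8e-11, 4.9e-11],
        "acetonitrile": [2.0e-11, 9.0e-11, 1.5e-11, 6.0e-11],
    }
    for name, values in data.items():
        print(name, round(coefficient_of_variation(values), 2))
    for pair, r in pairwise_correlations(data).items():
        print(pair, round(r, 2))
```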
We have also investigated the molecular abundances in the breath samples of patients who were sampled two or three times with several days between subsequent samplings. This is shown for acetonitrile for the respective eight patients in figure 11. It is remarkable that the abundance of acetonitrile for each patient does not change much (by 25% on average) over several days compared to the large variation (within a factor of seven) between different patients. In particular, one patient has a very high content of acetonitrile at both times, whereas two others reveal very low concentrations. The patient with the second highest acetonitrile abundance in figure 11 is one of the two active smokers in this study.
Figure 12. Comparison of MMWGS measurements with the commercial system from VDI (purple) and a system based on SiGe BiCMOS technology (green). The results are very similar, which proves the capability of the SiGe BiCMOS system for highly integrated low-cost breath gas sensors.

Cost-efficient easy-to-use MMWGS system
A potential application of an MMWGS breath gas sensor requires an easy-to-use and affordable system, as discussed in the introduction. Considering this, we have analyzed our MMWGS system with respect to cost and performance. The main cost drivers are the transmitter and receiver. Alternatives are recently developed transmitter and receiver modules fabricated in SiGe BiCMOS technology. We have implemented a SiGe BiCMOS transmitter and receiver developed at the Leibniz-Institut für Innovative Mikroelektronik (IHP) in our MMWGS breath gas sensor and demonstrated its performance. The SiGe BiCMOS transmitter and receiver are each attached to a 10 × 10 cm² printed circuit board which contains all electronics. At this stage, the boards contain a lot of functionality for basic investigations which is not required for a practical application scenario. Due to the silicon-based technology, the circuits can be integrated in very compact modules on the scale of a few centimeters. The modules replace the commercial devices from VDI (∼3 × 3 × 10 cm³ each), which require additional synthesizers (∼25 × 20 × 5 cm³ each). For an arbitrarily chosen sample (no. 42), we performed an additional measurement with the SiGe BiCMOS system immediately after the measurements with the commercial system. We investigated an absorption line of acetone at 249.805 GHz. Both systems operated at similar settings to allow for a comparison. For technical reasons, the integration time of the SiGe BiCMOS system is limited to 8 ms. In order to be comparable with the 50 ms integration time of the commercial system, we averaged six measurements, corresponding to an integration time of 48 ms. The baseline voltages as measured by the direct signals were 39.5 mV and 35 mV for the VDI system and the SiGe BiCMOS system, respectively.
The results exhibit a very similar SNR (VDI: 46, SiGe: 40), see figure 12, which demonstrates the feasibility of the SiGe BiCMOS system for studying breath gas samples.
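The averaging used to match the integration times can be sketched as follows. The sketch assumes that each measurement is a spectrum sampled on the same frequency grid and that the noise is uncorrelated between measurements, in which case the RMS noise decreases with the square root of the number of averaged spectra; both are assumptions made for illustration. All function names and the example values are hypothetical.

```python
import math


def average_traces(traces):
    """Average several spectra point by point (all on the same frequency grid)."""
    if not traces:
        raise ValueError("at least one trace is required")
    length = len(traces[0])
    if any(len(trace) != length for trace in traces):
        raise ValueError("all traces must have the same length")
    return [sum(column) / len(traces) for column in zip(*traces)]


def effective_integration_time_ms(single_time_ms, n_averages):
    """Total integration time obtained by averaging n measurements."""
    return single_time_ms * n_averages


def expected_noise_reduction(n_averages):
    """Factor by which RMS noise drops, assuming uncorrelated noise."""
    return math.sqrt(n_averages)


if __name__ == "__main__":
    # Six measurements of 8 ms each, as described in the text.
    print(effective_integration_time_ms(8, 6), "ms")
    print(round(expected_noise_reduction(6), 2))

    # Hypothetical three-point spectra in volts (illustration only).
    spectra = [[0.1, 0.5, 0.1], [0.3, 0.7, 0.1], [0.2, 0.6, 0.1]]
    print(average_traces(spectra))
```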

Conclusion
We have studied breath gas from 2 × 28 samples from COPD patients with our novel MMWGS system. Selected absorption lines from a set of 12 molecules were measured and analyzed (eight of them in more detail). The abundances of the molecules in the samples were determined from the absorption spectra using our automatic calibration-free fit model. The results of four molecules were compared to the well-established GC-MS method. The results obtained with MMWGS and GC-MS agree very well for ethanol and acetone and well for acetonitrile. These encouraging results prove the validity of MMWGS for breath analysis. For acetaldehyde, there is no correlation between the two methods. In this case, the results are likely affected by variations in the sampling process, since the reliability of both methods is similar. For MMWGS, the quantities from duplicate samples deviate from the respective mean values by 6% (acetone) to 18% (acetaldehyde) on average. Even the quantities of the molecules relative to each other agree well between both methods, considering the fundamentally different principles of operation and data analysis methods. To derive accurate molecular quantities, it is a major advantage that MMWGS systems are calibration-free and highly specific. Even very similar molecules, such as isotopologues and isomers, can be distinguished unambiguously. Since the signals in MMWGS depend strongly on absorption line intensities, the detection limits are different for each molecule. For acetonitrile, the detection limit is similar to that of GC-MS in this study. By improving the noise performance and the bandwidth of the system, its sensitivity and selectivity can be further enhanced. An extension of the absorption path length (L) of the gas cell can increase the sensitivity as well, because the absorption signal scales approximately linearly with L. It should be noted that this study was limited to a small number of molecules in order to facilitate a thorough comparison with GC-MS.
In a previous investigation, a total of 21 molecules were detected in the breath of healthy humans [37]. Generally, MMWGS is limited to polar molecules, because only these provide strong absorptions, but it covers a large range of molecular masses. Typically, our system works with molar masses of around 20 g mol⁻¹ to 60 g mol⁻¹. One limitation for the selectivity is the incompleteness of databases (in particular for many larger molecules), which need to be extended in the future. To demonstrate the potential for clinical studies, we have investigated correlations between the abundances of the different molecules. We found noticeable correlations between formaldehyde and acetaldehyde (R = 0.61) as well as between acetonitrile and acetaldehyde (R = 0.66). In the future, more molecules relevant for breath analysis as well as correlations of clinical variables will be investigated.
In conclusion, we demonstrated that MMWGS provides reliable and valid results for the analysis of human breath in comparison to GC-MS, the gold standard, and is well-suited for large-scale clinical studies. The main advantageous features of MMWGS are the high sensitivity, specificity, and selectivity as well as the ability to provide absolute quantities without calibration. These characteristics can provide new insights into breath research, as certain molecules can be detected exclusively, more specifically, or in lower concentrations compared to other methods. Hence, MMWGS can be a valuable complement to the field of human breath analysis. Furthermore, due to the simple principle of operation, the compact system size, potentially low costs as well as fast and easy handling and data analysis, MMWGS offers the prospect of wide use in doctors' offices or hospitals. In the future, fully integrated, even more compact low-cost systems can be developed thanks to the advance of SiGe BiCMOS and CMOS technology.

Data availability statement
The data that support the findings of this study are available upon reasonable request from the authors.