Can Deep Learning Design RF Power Amplifiers Without Full EM Simulation?

A Chalmers/Tampere paper trains a CNN on EM-simulated layouts to search Doherty amplifier combiners in milliseconds. EM simulation is amortized, not eliminated.


Deep learning does not remove electromagnetic simulation from RF power-amplifier design. A new paper from a Chalmers/Tampere group trains a convolutional network on EM-simulated layouts and uses it as a fast surrogate inside a genetic-algorithm search, but an EM solver still generated every one of those samples and still validates the final design. Deep learning here compresses the iteration loop for engineers who already know their targets; it does not make the EM step optional.

How does the paper’s pipeline actually work?

The inverted Doherty paper (arXiv:2606.27002, submitted 25 June 2026) frames PA combiner design as a supervised-learning problem. Each candidate combiner is a pixelated layout of metal and substrate, and that grid is the input to a convolutional neural network whose output is the predicted S-parameters of the structure.

The training set is a body of candidate layouts, each labeled with S-parameters from a full-wave EM solve. Once trained, the network predicts S-parameters far faster than a full-wave solve, which is the only reason a search over the combiner space becomes tractable. The design space of a pixelated combiner is enormous, far beyond brute-force EM enumeration.

A genetic algorithm then does the actual searching, driving the impedance at the main and auxiliary transistor current-source planes toward targets set by a dual-state impedance synthesis method that addresses both peak and back-off power conditions. The CNN sits only in the inner loop of that search, scoring candidate layouts; it never touches hardware directly.
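The search loop described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `toy_surrogate` is a hypothetical stand-in for the trained CNN (it returns two metal fill fractions rather than S-parameters), `TARGET` is an invented pair of numbers standing in for the impedance targets, and the population size, mutation rate, and selection scheme are assumptions chosen only to make the loop run.

```python
import random

GRID = 15                 # 15-by-15 binary pixel grid (metal = 1, substrate = 0)
N_PIXELS = GRID * GRID
TARGET = (0.7, 0.3)       # hypothetical targets, NOT real impedance values


def toy_surrogate(layout):
    """Hypothetical stand-in for the CNN: maps a layout to two numbers.

    A real surrogate would predict S-parameters; this one just returns the
    metal fill fraction of the first and second half of the flattened grid.
    """
    half = N_PIXELS // 2
    first = sum(layout[:half]) / half
    second = sum(layout[half:]) / (N_PIXELS - half)
    return first, second


def cost(layout, target):
    """Squared error between the surrogate's prediction and the target."""
    pred = toy_surrogate(layout)
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def crossover(a, b, rng):
    """Single-point crossover of two flattened layouts."""
    cut = rng.randrange(1, N_PIXELS)
    return a[:cut] + b[cut:]


def mutate(layout, rng, rate=0.01):
    """Flip each pixel independently with probability `rate`."""
    return [1 - px if rng.random() < rate else px for px in layout]


def run_ga(target, pop_size=40, generations=60, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(N_PIXELS)] for _ in range(pop_size)]
    for _ in range(generations):
        # Every cost evaluation is a cheap surrogate call, never an EM solve.
        pop.sort(key=lambda lay: cost(lay, target))
        elite = pop[: pop_size // 4]          # keep the best quarter unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        pop = elite + children
    return min(pop, key=lambda lay: cost(lay, target))


best = run_ga(TARGET)
print(toy_surrogate(best), cost(best, TARGET))
```

In a real flow, `best` would be handed to a full-wave EM solver before anything is fabricated; the surrogate's score is only a first-pass estimate.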

Is EM simulation really eliminated?

No. The paper frames the CNN as an EM surrogate, but the substitution is inside the GA loop, not across the pipeline. EM simulation does two jobs the CNN cannot: it generates the training data, and it checks whether the CNN’s predictions hold on the final design.

This distinction changes who the method is for. A team that already runs an EM solver routinely can fold a body of simulations into a one-time training set and then explore the design space cheaply. A team with no EM infrastructure gains nothing, because it cannot produce the training set in the first place. The CNN is a productivity multiplier on top of an existing EM workflow, not a substitute for having one.

The generalization footprint reinforces this. The trained network is narrow: one band (1.9–2.5 GHz), one transistor family (GaN HEMT). The paper does not demonstrate transfer across processes or bands. Move to a different foundry, a different dielectric, or a different frequency plan and the training set has to be regenerated. That is consistent with how surrogate modeling works in photonics and antenna inverse design: the model is cheap to query, expensive to retrain, and bounded by the geometry of its training distribution.

What did the prototype actually measure?

The fabricated inverted Doherty PA is a GaN HEMT device with a pixelated output combiner. Across 1.9–2.5 GHz it delivered 51–63% peak drain efficiency, 48–54% efficiency at 6-dB back-off, and 44 ± 0.3 dBm saturated output power.

With digital predistortion applied, the adjacent-channel leakage ratio is better than −53.2 dBc. That DPD step is worth flagging. The headline ACLR figure is a post-correction result, and the PA needs predistortion to be usable, like nearly every other Doherty in the literature.

The peak-efficiency numbers are not record-setting for the class: the group’s own earlier black-box Doherty exceeded 74% drain efficiency at its target band, and the inverted design’s 51–63% is a deliberate trade of peak efficiency for bandwidth. The inverted Doherty’s selling point is sustained efficiency across a fractional band at compact size, not absolute peak drain efficiency.
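Two of the measured figures are easier to read after a unit conversion. The sketch below uses the standard definitions of fractional bandwidth (bandwidth divided by center frequency) and of dBm (decibels relative to 1 mW); the function names are illustrative, and the inputs are the band edges and saturated power quoted above.

```python
def fractional_bandwidth(f_lo_ghz, f_hi_ghz):
    """Bandwidth divided by the arithmetic center frequency."""
    center = (f_lo_ghz + f_hi_ghz) / 2
    return (f_hi_ghz - f_lo_ghz) / center


def dbm_to_watts(p_dbm):
    """Convert power in dBm (dB relative to 1 mW) to watts."""
    return 10 ** (p_dbm / 10) / 1000


fbw = fractional_bandwidth(1.9, 2.5)   # roughly 0.27, i.e. about 27%
p_sat = dbm_to_watts(44.0)             # roughly 25 W

print(f"fractional bandwidth: {fbw:.1%}")
print(f"saturated output power: {p_sat:.1f} W")
```

A fractional bandwidth near 27% is the figure to weigh against the lower peak efficiency.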

How does the inverted design compare to the group’s other Doherty papers?

This is the third paper in a trilogy, and reading them in sequence exposes the actual engineering trade the group made. All three come from Han Zhou, Haojie Chang, David Widén, and Christian Fager, split between Chalmers University of Technology and Tampere University, and the inverted topology was chosen because the earlier two designs ran out of bandwidth.

The black-box Doherty (arXiv:2603.16565, March 2026) reached more than 74% maximum drain efficiency and above 52% at 9-dB back-off at roughly 2.75 GHz, with better than −60.8 dBc ACLR after DPD on a 20-MHz 5G NR-like waveform. That is the high-water mark on raw efficiency, but it is effectively a spot-frequency design. The conventional Doherty (arXiv:2606.18395, June 2026) widened things slightly to 2.6–2.8 GHz with more than 71.2% peak and 64% at 6-dB back-off, using the same CNN-plus-GA inverse-design stack with dual-state impedance synthesis.

The inverted design in this paper trades peak efficiency for bandwidth: 51–63% peak is lower than both predecessors, but it holds across 1.9–2.5 GHz rather than collapsing outside a narrow window. The paper attributes the move to the inverted topology’s greater bandwidth potential, after the black-box and conventional combiners showed degraded bandwidth. This is a coherent research arc that maps directly onto the classic Doherty tradeoff curve between back-off depth, peak efficiency, and fractional bandwidth.

| Paper | Topology | Band | Peak drain η | Back-off η | ACLR (post-DPD) |
| --- | --- | --- | --- | --- | --- |
| 2603.16565 | Black-box Doherty | ~2.75 GHz | >74% | >52% @ 9 dB | better than −60.8 dBc |
| 2606.18395 | Conventional Doherty | 2.6–2.8 GHz | >71.2% | 64% @ 6 dB | better than −51.3 dBc |
| 2606.27002 | Inverted Doherty | 1.9–2.5 GHz | 51–63% | 48–54% @ 6 dB | better than −53.2 dBc |

One caveat on that table: all three papers are from the same authorship group. The trilogy is a single lab building up a method, not independent replication. The results are consistent with each other and with the underlying Doherty theory, which is reassuring, but they are not yet corroborated by a second group.

What does an RF engineer actually gain?

The win is iteration speed, not expertise displacement. The expensive part of combiner design is the EM solve, and the CNN moves that cost out of the inner loop of a search and into a one-time training set. An engineer who has already chosen the transistor, fixed the substrate, set the bias points, and determined the load-pull targets can let the GA explore layouts at CNN speed and reserve full-wave EM for final verification. That is a real reduction in design-cycle time, and it is the part most likely to survive into commercial EDA tooling.
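The amortization argument can be put in rough numbers. In the sketch below, only the 5,000-layout training set comes from the article; the EM solve time, the number of GA evaluations, and the number of final verification solves are hypothetical placeholders, so the output illustrates the shape of the trade rather than the paper's actual savings.

```python
N_TRAINING = 5_000            # EM-simulated training layouts (from the article)

# Hypothetical values, chosen only for illustration:
EM_MINUTES_PER_SOLVE = 2.0    # assumed cost of one full-wave solve
GA_EVALUATIONS = 100_000      # assumed number of candidates the GA scores
N_VERIFY = 10                 # assumed EM re-solves of final candidates


def em_hours_brute_force(n_candidates, em_minutes):
    """EM time if every GA candidate were scored by a full-wave solve."""
    return n_candidates * em_minutes / 60


def em_hours_with_surrogate(n_training, n_verify, em_minutes):
    """EM time when the solver only builds the training set and verifies."""
    return (n_training + n_verify) * em_minutes / 60


def break_even_candidates(n_training, n_verify):
    """Candidate count above which the surrogate route needs fewer EM solves."""
    return n_training + n_verify


brute = em_hours_brute_force(GA_EVALUATIONS, EM_MINUTES_PER_SOLVE)
amortized = em_hours_with_surrogate(N_TRAINING, N_VERIFY, EM_MINUTES_PER_SOLVE)

print(f"EM hours, solver in the loop: {brute:.0f}")
print(f"EM hours, surrogate in the loop: {amortized:.0f}")
print(f"break-even at {break_even_candidates(N_TRAINING, N_VERIFY)} candidates")
```

The model ignores CNN training and inference time, which matters only if those are comparable to the EM budget. It also makes the retraining cost visible: change the substrate or band and the `N_TRAINING` term is paid again.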

What it does not do is remove the need for the engineer. The targets the GA optimizes toward (the optimal load impedance, the back-off impedance ratio, the frequency band) are all set by someone who already understands the device. The dual-state impedance synthesis that frames the search is itself a load-pull-derived method. The CNN fails silently outside its training distribution, and the paper offers no mechanism for detecting that failure beyond re-running EM on suspicious candidates. A practitioner who treats the surrogate’s output as ground truth, rather than as a fast first-pass estimate to be checked, will ship a bad design faster than they would have without it.

The deeper pattern here is not specific to RF. Surrogate models that amortize a slow physics solver into a training set are the same trick being applied in photonic inverse design, metasurface optimization, and antenna synthesis. In every case the headline “machine learning replaces simulation” resolves, on inspection, into “machine learning makes simulation cheaper per candidate by paying for it once.” For RF power amplifiers the Chalmers trilogy is the most concrete worked example so far: a genuine methodological advance for teams that already hold an EM license and a load-pull file, and a non-event for everyone else.

Frequently Asked Questions

How does the inverted Doherty’s training set compare to the group’s earlier conventional Doherty CNN?

The inverted paper trained on 5,000 pixelated layouts on a 15-by-15 grid, augmented fourfold by flipping and rotation. The earlier conventional Doherty paper used roughly 75,000 circuits augmented to 600,000, with a deeper network: 12 convolutional blocks of 32 filters plus 6 fully connected layers of 2,048 neurons, trained over 300 epochs with Adam at lr=0.001 and mean-absolute-error loss.
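A fourfold flip-and-rotation augmentation can be sketched as follows. The article does not say which four transforms the paper used, so the set here (identity, left-right flip, up-down flip, 180-degree rotation) is an assumption, and the function names are illustrative. The sketch also leaves out the labels: a mirrored layout presumably needs its S-parameter port indices permuted to match, and how the paper handles that is not stated here.

```python
def flip_lr(grid):
    """Mirror the layout left to right."""
    return [row[::-1] for row in grid]


def flip_ud(grid):
    """Mirror the layout top to bottom."""
    return [row[:] for row in grid[::-1]]


def rot180(grid):
    """Rotate by 180 degrees (equivalent to applying both flips)."""
    return flip_ud(flip_lr(grid))


def augment(grid):
    """Return four variants of one layout: original plus three transforms."""
    original = [row[:] for row in grid]
    return [original, flip_lr(grid), flip_ud(grid), rot180(grid)]


# Tiny 3-by-3 example; the real grids are 15-by-15.
sample = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
variants = augment(sample)

n_simulated = 5_000
n_after_augmentation = n_simulated * len(variants)   # 20,000 training samples
print(n_after_augmentation)
```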

What does a team need to reproduce this design pipeline?

An ADS Momentum license to generate the 5,000 EM-simulated training layouts, GPU time to train the CNN, and the target transistor’s load-pull data. The fabricated prototype used two 10-W MACOM GaN HEMT devices on a 78-by-72 mm, 20-mil Rogers 4350B board at an optimal load impedance of 30 ohms, so the same substrate and device family are required for the training distribution to hold.

What happens when the genetic algorithm proposes a layout outside the CNN’s training distribution?

The CNN returns a sub-millisecond S-parameter prediction with no confidence signal, and the only check is a full-wave EM re-solve of the final chosen layout. The deeper risk is not the fabricated prototype but the GA’s intermediate population: a single out-of-distribution candidate can bias the search toward a wrong region of the combiner space, and the loop has no mechanism to detect that drift before final EM verification.

Is the reported ACLR of −53.2 dBc achievable without digital predistortion?

No. The raw ACLR before DPD was −28.6 dBc on a 40-MHz, 7-dB-PAPR OFDM signal, and the −53.2 dBc headline is a post-correction figure. The small-signal gain exceeds 10 dB across 1.8 to 2.6 GHz, but deployment still requires a DPD stage paired with the PA. The same holds for the group’s black-box (−60.8 dBc) and conventional (−51.3 dBc) designs, whose ACLR figures are also post-DPD.

How many electromagnetic simulations does this method actually avoid?

The combiner’s pixel grid is 15 by 15 with binary metal/substrate pixels, giving 2^225 possible patterns. The CNN’s sub-millisecond S-parameter predictions let the GA sample that space densely, but the amortization is bounded by training cost: only 5,000 EM runs went into the training set, so accuracy across the rest of the 2^225 space depends entirely on how well those 5,000 layouts cover the geometry distribution.
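The scale gap between the search space and the training set is easy to check with exact integer arithmetic. The sketch below only restates the figures above (a 15-by-15 binary grid and 5,000 EM runs); the variable names are illustrative.

```python
GRID = 15
N_EM_RUNS = 5_000

n_patterns = 2 ** (GRID * GRID)        # 2^225 binary layouts, exact integer
n_digits = len(str(n_patterns))        # 68 decimal digits
sampled_fraction = N_EM_RUNS / n_patterns

print(f"possible layouts: about {n_patterns:.2e} ({n_digits} digits)")
print(f"fraction covered by EM runs: {sampled_fraction:.2e}")
```

The sampled fraction is on the order of 1e-64, which is why coverage of the geometry distribution, not the raw sample count, determines how far the surrogate can be trusted.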
