Discovery Labs (DSCO) has developed Surfaxin (lucinactant) for the prevention of respiratory distress syndrome (RDS) in premature infants. Surfaxin has an expected PDUFA decision date of 06 MAR 2012. Here, we analyze the key issue for Surfaxin’s latest NDA: demonstrating bioequivalence of the phase III trial formulation and the currently manufactured product.
Most DSCO followers consider improving and validating the fetal rabbit biological activity test (FRBAT) to be the primary focus of DSCO’s regulatory efforts over the last two years. That’s true. Here, we reframe FRBAT as a bioequivalence issue with important implications for FDA approval requirements.
Let’s begin by examining an excerpt from a 2009 8-K describing DSCO’s meeting with the FDA following the previous CRL (emphasis our own):
During Surfaxin’s Phase 3 clinical trials, a leading academic neonatologist assessed the biological activity of the clinical batches by measuring respiratory compliance in a well-established pre-term lamb model of RDS. After completing Surfaxin’s Phase 3 clinical trials, in accordance with discussions with the FDA, Discovery Labs validated and implemented the BAT as a recurring quality control test to confirm biological activity for Surfaxin release and stability testing. Based on agreements reached in meetings with the FDA in 2006 and 2008, Discovery Labs conducted a series of preclinical experiments to establish comparability between Surfaxin drug product used in Phase 3 clinical trials and the Surfaxin drug product intended to be manufactured for commercial use. Accordingly, Discovery Labs initiated a series of side-by-side studies employing both the pre-term lamb model of RDS and the BAT and believes that the correlated results demonstrate comparability and support approval of Surfaxin.
At the recent June 2 meeting, Discovery Labs presented a compilation of previously-submitted data from the pre-term lamb model and BAT studies, together with a comprehensive statistical evaluation of such data, intended to establish to the satisfaction of the FDA comparability of clinical drug product to Surfaxin drug product to be manufactured for commercial use. The comprehensive statistical evaluation was a comparative regression analysis using an accepted FDA statistical method. Discovery Labs believes that the data and related statistical evaluation that it submitted to the FDA are highly supportive of the comparability of clinical drug product to commercial Surfaxin.
The FDA stated, for the first time, that the 2006 and 2008 agreement with Discovery Labs to establish comparability through these studies is unprecedented and the determination of whether Discovery Labs has adequately established comparability is solely within the FDA’s discretion. The FDA now insists, for the first time, that data generated from the pre-term lamb model and BAT studies must demonstrate, in a point-to-point analysis, the same relative changes in respiratory compliance between both models over time. Based on this newly-defined standard, the FDA indicated that to adequately establish comparability in this manner would be an extremely high hurdle and that, from the FDA’s perspective, the data analysis provided by Discovery Labs did not meet that standard.
The FDA suggested that the comparability studies in the pre-term lamb model and the BAT would not be necessary if the BAT had been implemented to assess Surfaxin drug product used in the Phase 3 clinical trials. Additionally, the FDA suggested that, to increase the likelihood of gaining Surfaxin approval and as an alternative to demonstrating comparability using the pre-term lamb model and BAT, Discovery Labs could consider conducting a limited clinical trial employing only the BAT as a path forward to Surfaxin approval.
Slide 17 of DSCO’s Feb 2012 corporate presentation lists the requirements for validation of FRBAT. We reiterate an important bullet point below:
Therefore DSCO needed to link pre-term lamb results of biological activity of Surfaxin used in Phase 3 trial with FRBAT results of biological activity of currently manufactured Surfaxin; this also links Surfaxin used in the phase 3 trial to currently manufactured Surfaxin.
When a company changes a drug’s formulation, FDA requires the company to demonstrate bioequivalence. DSCO changed Surfaxin’s formulation, so what DSCO is calling “comparability” is in fact bioequivalence.
Here’s the tricky part: well-defined bioequivalence guidance exists for orally available drugs based on established pharmacokinetic (PK) models. However, no formal guidance exists (JAMP) for demonstrating bioequivalence of inhaled drugs because blood concentration PK parameters are not informative of pharmacodynamics (PD) when the drug is delivered directly to the lung and the lung is the site of action. Instead, one way to demonstrate bioequivalence of an inhaled drug is to show consistency of efficacy outcomes in a clinical trial, which is why FDA suggested DSCO pursue a limited clinical trial.
A surfactant is similar to an inhaled drug because it is delivered directly to the lungs via lavage. However, DSCO cannot conduct a human clinical trial to demonstrate bioequivalence because:
- Bioequivalence tests are typically conducted in healthy adult volunteers. It is not possible to lavage the lungs of healthy adults who are not surfactant deficient.
- A bioequivalence trial would need to be conducted in premature infants. The company claims in its 01 FEB 2011 Surfaxin Update, starting at 29:30, that the FDA agrees with its position that it is unethical to conduct a clinical trial in premature babies for the sole purpose of demonstrating bioequivalence. It’s possible this is simply management spin since we don’t know exactly what FDA said in the back and forth discussion. But for the purposes of this post, let’s assume FDA indeed reversed its position regarding the previous suggestion of a clinical trial.
In summary, DSCO needs to demonstrate bioequivalence between the two Surfaxin formulations. Surfaxin’s PD characteristics are similar to those of inhaled drugs. No formal bioequivalence guidance exists for inhaled drugs, and DSCO cannot conduct something similar to past inhaled bioequivalence clinical trials in humans due to safety and ethical considerations. Thus, DSCO’s bioequivalence situation is unprecedented for the FDA and DSCO is in uncharted regulatory waters. DSCO is attempting to convince FDA of bioequivalence and link the two formulations together by showing agreement between two validated animal models.
Why did DSCO receive the last CRL?
We believe FDA rejected DSCO’s previous “correlated results” in 2009 because statistical research has shown that the naive approach of showing point-to-point concordance based on a Pearson correlation coefficient is flawed. See “Measurement in Medicine: The Analysis of Method Comparison Studies” and “Statistical methods for assessing agreement between two methods of clinical measurement” for a detailed discussion of why Pearson correlation is not suitable for method comparison.
We guess DSCO’s “comparative regression analysis” was similar to Finney’s method (see here and here). As described in the JAMP paper above, Finney’s method has problems when the dose-response curve (or time-response curve in DSCO’s case) is relatively flat. FDA has trouble evaluating Finney’s method in such cases (see: corticosteroids), which is probably why FDA did not accept the regression analysis. Further, simply comparing the slopes of two lines as DSCO mentions in its Surfaxin update presentations is not rigorous enough to demonstrate concordance.
We therefore believe FDA is looking for the point-to-point analysis method proposed in the papers above: a Bland-Altman plot. The Bland-Altman plot has become the standard for testing assay agreement, which is exactly what DSCO is attempting to show between the rabbit and lamb assays.
Assay Variance as a Limiting Factor
Below we highlight a quote from the “Statistical methods…” paper that shows the extent of DSCO’s challenge:
Repeatability is relevant to the study of method comparison because the repeatabilities of two methods of measurement limit the amount of agreement which is possible. If one method has poor repeatability – i.e. there is considerable variation in repeated measurements on the same subject – the agreement between the two methods is bound to be poor too. When the old method is the more variable one, even a new method which is perfect will not agree with it. If both methods have poor repeatability, the problem is even worse.
It is not possible to repeatedly test a surfactant in a single pre-term animal, so we can’t assess the repeatability of each model directly. But we know from DSCO’s Surfaxin update conference call on 28 SEP 2011 (fast forward to 29:00) that whole-animal biological activity tests such as the pre-term rabbit and pre-term lamb are highly variable, with coefficients of variation of up to 0.7. DSCO’s statistical challenge is to show that two independent, high-variance animal models can agree.
How much variance does each animal model have?
DSCO claims they can’t show real data in their Surfaxin presentations because the data is proprietary. Fortunately for us, pre-term rabbit and lamb models have been used for a long time for similar surfactants. DSCO has also previously published lucinactant data in a pre-term lamb model.
For pre-term lambs, variance can be extracted from the figure below (from: Pulmonary Distribution of Lucinactant and Poractant Alfa and Their Peridosing Hemodynamic Effects in a Preterm Lamb Model of Respiratory Distress Syndrome). Note: this is direct Surfaxin data. The figure shows pre-term lamb response to Lucinactant (Surfaxin, black dots) as a percent change in compliance. The error bars are standard error of the mean (SEM = standard deviation / sqrt(n)). Reading from the graph, mean = 237.5, SEM = 62.5, n = 6. Inverting the SEM formula gives SD = SEM * sqrt(n) = 153.09, which means the coefficient of variation (SD / mean) of the percent change in compliance for the lamb model is approximately 0.65.
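The arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation described in the text; the function name `sd_from_sem` is our own, and the inputs are the values read off the published figure.

```python
import math

def sd_from_sem(sem, n):
    """Recover the sample standard deviation from the standard error of the mean."""
    return sem * math.sqrt(n)

# Values read off the published lamb figure, as quoted in the text
lamb_mean = 237.5   # mean percent change in compliance
lamb_sem = 62.5     # standard error of the mean
lamb_n = 6          # animals per group

lamb_sd = sd_from_sem(lamb_sem, lamb_n)
lamb_cv = lamb_sd / lamb_mean   # coefficient of variation = SD / mean

print(f"lamb SD = {lamb_sd:.2f}, lamb CV = {lamb_cv:.3f}")
```

The CV comes out between 0.64 and 0.65, consistent with the rounded figure used in the rest of this post.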
For pre-term rabbits, variance can be extracted from the figure below (from: Treatment Responses to Surfactants Containing Natural Surfactant Proteins in Preterm Rabbits). Note: this is *not* Surfaxin data, but this data is illustrative of how rabbits respond to SP-B and Surfaxin is a synthetic peptide of SP-B. The figure shows how pre-term rabbits respond to SP-B in terms of compliance alone. We need to convert compliance to a percent change of compliance over control to make it comparable with the figure above. Reading from the graph: control mean = 0.355, SEM = 0.011, n = 15. SP-B mean = 0.822, SEM = 0.044, n = 16. Converting this to percent change, the mean response relative to control is SP-B/Control = 2.31, which is on par with the lamb data. Using the standard rules for propagating standard deviations through a ratio, we calculate the percent change SD = 0.57, which means the coefficient of variation for the rabbit model is 0.25. This figure seems reasonable since the group that conducted this study has published many papers using pre-term rabbit models and DSCO was able to reduce the FRBAT variability by 40%.
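For readers who want to reproduce the rabbit numbers, here is a minimal sketch. It assumes the "standard deviation rules" are first-order error propagation for a ratio of two independent means (relative variances add); that assumption is ours, but it does reproduce the SD of roughly 0.57 and the CV of roughly 0.25 quoted above. The function names are our own.

```python
import math

def sd_from_sem(sem, n):
    """Recover the sample standard deviation from the standard error of the mean."""
    return sem * math.sqrt(n)

def ratio_with_sd(num_mean, num_sd, den_mean, den_sd):
    """First-order error propagation for a ratio of two independent means.

    Assumption: the squared relative SD of the ratio is the sum of the
    squared relative SDs of numerator and denominator.
    """
    ratio = num_mean / den_mean
    rel_sd = math.sqrt((num_sd / num_mean) ** 2 + (den_sd / den_mean) ** 2)
    return ratio, ratio * rel_sd

# Values read off the published rabbit figure, as quoted in the text
control_sd = sd_from_sem(0.011, 15)
spb_sd = sd_from_sem(0.044, 16)

rabbit_ratio, rabbit_ratio_sd = ratio_with_sd(0.822, spb_sd, 0.355, control_sd)
rabbit_cv = rabbit_ratio_sd / rabbit_ratio

print(f"ratio = {rabbit_ratio:.2f}, SD = {rabbit_ratio_sd:.2f}, CV = {rabbit_cv:.2f}")
```

Note that under this rule the CV of the ratio does not depend on the ratio itself, only on the relative spread of the treated and control groups.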
Bioequivalence in the Bland-Altman Plot
Here is our assumption regarding DSCO’s proposed experimental protocol:
- Manufacture multiple lots of Surfaxin for each shelf-life time point
- Administer each lot to a paired pre-term rabbit and pre-term lamb. Each pair gets Surfaxin from the same lot.
- Measure the percent change in pulmonary compliance over baseline for the pre-term rabbit and lamb.
- Generate a Bland-Altman plot from paired data to show that the rabbit and lamb have “the same relative changes in respiratory compliance between both models over time.”
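To make the last step concrete, here is a rough sketch of the numbers behind a Bland-Altman plot: the mean difference between paired measurements (the bias) and the limits of agreement around it. The paired values below are invented purely for illustration and are not DSCO data; the function name `bland_altman` and the conventional z = 1.96 multiplier are likewise our own choices.

```python
import statistics

def bland_altman(method_a, method_b, z=1.96):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the pairwise differences; the limits of agreement
    are bias +/- z * SD of those differences.
    """
    if len(method_a) != len(method_b):
        raise ValueError("measurements must be paired")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)   # sample SD (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd

# HYPOTHETICAL percent changes in compliance, one rabbit/lamb pair per lot
rabbit = [210.0, 250.0, 190.0, 300.0, 240.0]
lamb = [230.0, 180.0, 260.0, 270.0, 250.0]

bias, lower, upper = bland_altman(rabbit, lamb)
print(f"bias = {bias:.1f}, limits of agreement = ({lower:.1f}, {upper:.1f})")
```

In an actual plot, each pair's difference would be drawn against the pair's average, with horizontal lines at the bias and at the two limits.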
Regarding orally inhaled bioequivalence, the JAMP article states, “conventional bioequivalence studies are found acceptable if the 90% CIs comparing the test drug and reference drug products fall within the acceptable range of 80-125%.” The article notes that wider limits of 67-150% were used as the pharmacodynamic bioequivalence limits for the approval of generic albuterol CFC MDIs in the 1990s. We will assume this more liberal 67-150% range for assessing the bioequivalence of the rabbit and lamb models.
Translating this range into a Bland-Altman context, the 90% confidence interval of the difference between the paired rabbit and lamb percent change in compliance needs to fall within 67-150% of the mean lamb percent change to demonstrate bioequivalence (0.33 of the mean on the downside, 0.5 of the mean on the upside). The SD of the difference between the rabbit and the lamb = 163. The z-score for the 90% confidence interval is 1.645. Since the rabbit and lamb seem to have approximately the same mean, the difference has mean = 0 with a 90% confidence interval of (-268, 268).
-268 is well below the lower end of the 67% limit of -0.33 * 237.5 = -78.375. Likewise, 268 is above the upper end of the 150% limit of 0.5 * 237.5 = 118.75. DSCO will be unable to satisfy the PD bioequivalence requirements previously specified for inhaled albuterol. Even bringing the lamb coefficient of variation down to rabbit levels of 0.25 will not satisfy the upper limit. DSCO would have to decrease the standard deviation of each assay by about 56% from the levels mentioned above to meet the albuterol criteria on the more lenient upper end.
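The interval arithmetic can be reproduced as follows. This sketch makes the same simplifying assumptions as the text: the two assays are independent, so their variances add, and the rabbit SD of 0.57 is expressed as 57 on the percent scale. The variable and function names are our own. Using the unrounded SD of the difference (about 163.4) gives a half-width near 269 instead of 268.

```python
import math

Z_90 = 1.645                     # z-score for a two-sided 90% interval

lamb_mean = 237.5                # percent change in compliance
lamb_sd = 62.5 * math.sqrt(6)    # about 153.09, from SEM and n
rabbit_sd = 57.0                 # 0.57 in ratio units, put on the percent scale

def half_width(sd_a, sd_b, z=Z_90):
    """Half-width of the interval for the difference of two independent assays."""
    return z * math.sqrt(sd_a ** 2 + sd_b ** 2)

lower_limit = -0.33 * lamb_mean  # 67% bound
upper_limit = 0.50 * lamb_mean   # 150% bound

current = half_width(lamb_sd, rabbit_sd)
lamb_at_rabbit_cv = half_width(0.25 * lamb_mean, rabbit_sd)

# Fractional cut in both SDs needed to shrink the half-width to the upper bound
required_reduction = 1 - upper_limit / current

print(f"limits: ({lower_limit:.2f}, {upper_limit:.2f})")
print(f"current half-width: {current:.1f}")
print(f"half-width with lamb CV at 0.25: {lamb_at_rabbit_cv:.1f}")
print(f"required SD reduction: {required_reduction:.1%}")
```

Because the half-width scales linearly with both SDs, the required reduction is simply one minus the ratio of the upper bound to the current half-width, which works out to roughly 55 to 56%.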
Note: our confidence interval analysis is not perfect and needs some correction because the interval is symmetric while the 67-150% limits are asymmetric when expressed as a ratio, which is why we are going with the more lenient upper bound. Although this analysis assumes a Bland-Altman comparison, we feel the high variability of the rabbit and lamb models precludes showing rigorous agreement with any method.
The high variance inherent in whole-animal biological activity test assays will prevent DSCO from meeting established inhaled albuterol bioequivalence standards when assay agreement is examined in a Bland-Altman plot. This is purely a statistical problem due to the nature of the assay – DSCO’s next option moving forward is to conduct a clinical trial using the new formulation and demonstrate consistency in the clinical outcome as the FDA originally requested.
We have no doubt Surfaxin works. We would approve it based on common sense because it appears to be effective and safe from the clinical trials. If DSCO had stuck with their phase III formulation, they probably would have been approved a long time ago. But because DSCO changed the formulation, they have to conduct an unprecedented bioequivalence trial for a surfactant delivered by lung lavage. DSCO entered uncharted regulatory territory, where they became stuck between a rock and a statistical hard place – a nightmare scenario for any company seeking drug approval.
The comments have asked for a source regarding the formulation change. Quoting from the 2009 8-K I linked to, “Based on agreements reached in meetings with the FDA in 2006 and 2008, Discovery Labs conducted a series of preclinical experiments to establish comparability between Surfaxin drug product used in Phase 3 clinical trials and the Surfaxin drug product intended to be manufactured for commercial use.” This is mentioned in all press releases and presentations – that they need to link the clinical product with the manufactured product.
The only reason to do that is if the formulations are different.
Further, in the FDA guidance on bioequivalence I linked in the article it says:
BE documentation can be useful during the IND or NDA period to establish links between (1) early and late clinical trial formulations; (2) formulations used in clinical trial and stability studies, if different; (3) clinical trial formulations and to-be-marketed drug product; and (4) other comparisons, as appropriate.
Case 3 is DSCO’s situation – the clinical trial formulation and manufactured to-be-marketed drug are different, clearly making this a matter of bioequivalence. So when DSCO says “link” they are couching it in the language of the bioequivalence guidance.
Here’s an interesting twist I originally missed. This press release from 2010 says, “The FDA has also acknowledged that Discovery Labs had successfully demonstrated in the preterm lamb model the comparability of Surfaxin clinical drug product to the to-be-marketed Surfaxin drug product.”
Let’s assume DSCO successfully demonstrated bioequivalence between the clinical and marketed version of Surfaxin via a lamb/lamb comparison and follow this train of thought:
- In my calculations of the published pre-clinical data, the lamb had a coefficient of variation (CV) of 0.65.
- Since DSCO showed bioequivalence, they were able to show that two lamb assays, each with a CV of 0.65, showed acceptable agreement, either through Bland-Altman or another method.
- Further, in my calculations of other groups working on preterm rabbits, the rabbit CV was 0.25. Let’s assume 0.25 is the best you can do, making DSCO’s rabbit CV 0.42 prior to the 40% improvement.
- The rabbit has a much lower CV than the lamb. So if they were able to demonstrate comparability with lamb/lamb 0.65/0.65, they should have been able to demonstrate comparability between rabbit/lamb at 0.42/0.65 in 2009. But they weren’t able to do that. And it wasn’t even close – on first inspection the company and FDA both thought they would have to do a limited PD trial because the statistics wouldn’t hold up.
- These relative CV trends hold up in the “representative” data in the DSCO presentation, where the lamb is more variable (larger error bars) than the rabbit.
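The comparison in the bullets above can be put into numbers. This is a sketch under two assumptions of ours: the assays are independent, and the rabbit and lamb responses share roughly the same mean, so that CVs can be combined directly. The names are our own.

```python
import math

lamb_cv = 0.65         # from the published lamb figure
rabbit_cv_now = 0.25   # assumed best case, taken from the other group's rabbit data
improvement = 0.40     # DSCO's stated reduction in FRBAT variability

# Rabbit CV before the 40% improvement
rabbit_cv_2009 = rabbit_cv_now / (1 - improvement)

def sd_of_difference(cv_a, cv_b):
    """SD of the difference of two independent assays, in units of the shared mean."""
    return math.hypot(cv_a, cv_b)

lamb_vs_lamb = sd_of_difference(lamb_cv, lamb_cv)
rabbit_vs_lamb = sd_of_difference(rabbit_cv_2009, lamb_cv)

print(f"rabbit CV in 2009: {rabbit_cv_2009:.2f}")
print(f"lamb/lamb SD of difference: {lamb_vs_lamb:.2f}")
print(f"rabbit/lamb SD of difference: {rabbit_vs_lamb:.2f}")
```

Under these assumptions the rabbit/lamb comparison has a smaller spread of differences (about 0.77 of the mean) than the lamb/lamb comparison (about 0.92 of the mean), which is what makes the 2009 outcome puzzling.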
So what gives?
First, how was DSCO able to demonstrate bioequivalence in lamb/lamb with such high CV? Those values are too high to work in a Bland-Altman assay agreement context.
If the lamb data was good enough for bioequivalence, then the rabbit should have been good enough as well since it has a lower CV. From the 2010 press release: “The letter focused primarily on certain aspects of the BAT, specifically whether preclinical data generated using both the BAT and a well-established preterm lamb model of RDS adequately supports the comparability of Surfaxin clinical drug product to the to-be-manufactured Surfaxin drug product, and whether the BAT can adequately distinguish change in Surfaxin biological activity over time.” So they clearly attempted to show that both FRBAT and the pre-term lamb model support comparability of the clinical and manufactured drug.
If they’ve already demonstrated comparability between the clinical and manufactured versions, then there’s no need to continue saying “this also links Surfaxin used in the phase 3 trial to currently manufactured Surfaxin” on slide 17 of the corporate presentation. It should be enough to simply show the lamb and rabbit are the same for the purposes of this complete response.
So to be fair in my caveats, something is unresolved in the data I pulled (either they had much better lamb data than was published or the original rabbit data was really bad), my model is incorrect, or the whole story is not being revealed since we only have the company’s side of the situation.