Difference or Weighted Difference

Hello,
I've been using the weighted difference function in Brainstorm to compute ERP contrast waveforms (ΔERN = error − correct) between conditions with substantially unequal trial counts (e.g., 150 correct vs. 15 error responses). My understanding is that this method accounts for the difference in signal reliability caused by the unequal trial numbers, by scaling the contribution of each condition according to its effective number of averages (Leff).
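
To make the reliability argument concrete, here is a minimal Python sketch of how I understand the effect of trial counts on a plain difference of averages. It is not Brainstorm's code and not necessarily the formula the weighted difference uses; the function names are my own, and it assumes the noise is independent across trials with the same variance in both conditions.

```python
import math


def effective_n_difference(n_a: int, n_b: int) -> float:
    """Effective number of averages for mean(A) - mean(B).

    Assumption: noise is i.i.d. across trials with equal variance in A and B,
    so var(mean(A) - mean(B)) = sigma^2 * (1/n_a + 1/n_b).
    """
    return 1.0 / (1.0 / n_a + 1.0 / n_b)


def noise_sd_of_difference(sigma: float, n_a: int, n_b: int) -> float:
    """Noise standard deviation of mean(A) - mean(B) under the same assumption."""
    return sigma * math.sqrt(1.0 / n_a + 1.0 / n_b)


# Trial counts from my example: 150 correct vs. 15 error responses.
n_correct, n_error = 150, 15
leff = effective_n_difference(n_correct, n_error)
print(f"Effective number of averages for the difference: {leff:.2f}")
```

Under these assumptions the difference is about as noisy as an average of roughly 14 trials, so the condition with fewer trials dominates the noise in the contrast.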

However, I recently received reviewer criticism labeling this function as a "weird normalization", so I would appreciate some clarification:

  1. What is the rationale behind the weighted difference in Brainstorm? Perhaps I have misunderstood it.
  2. Is it considered good practice when contrasting ERP conditions with asymmetric trial counts?
  3. In which cases is it recommended or preferred over a simple subtraction (A − B)?
  4. Are there any references or documentation that justify this use?

Thanks a lot for any insights or resources you can share.
Best