What should be compared in statistics?

You have many 45s epochs, split into four conditions, and you want to compare the power in frequency band f1 between condition A and condition B, right?

Over 45s, the brain processes are probably not strictly time-locked. If you average the epochs in the time domain, you will lose most of the activity that is not time-locked, including most of the higher-frequency oscillations.
If you estimate the PSD of the averaged signals, you will get something with no relevant information above 5Hz. This is not the appropriate way to proceed.
Whether you want to compute an average or a t-test, you should first estimate the PSD of the non-averaged recordings, and then average/test the PSD values.
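To illustrate why the order matters, here is a minimal sketch on simulated data in Python (NumPy/SciPy), not the Brainstorm implementation. The sampling rate, the number of epochs, the 10 Hz oscillation and the Welch window length are all assumptions chosen for the example.

```python
import numpy as np
from scipy.signal import welch

rng = np.random.default_rng(0)
sfreq = 200.0                                  # assumed sampling rate (Hz)
n_trials, n_times = 40, int(45 * sfreq)        # 40 simulated epochs of 45 s
t = np.arange(n_times) / sfreq

# Simulated epochs: a 10 Hz oscillation whose phase differs in every epoch
# (i.e. not time-locked), plus white noise.
phases = rng.uniform(0, 2 * np.pi, size=(n_trials, 1))
trials = np.sin(2 * np.pi * 10 * t + phases) + rng.standard_normal((n_trials, n_times))

nperseg = int(2 * sfreq)                       # 2 s Welch windows -> 0.5 Hz resolution

# Order to avoid: average in the time domain first, then estimate the PSD
freqs, psd_of_avg = welch(trials.mean(axis=0), fs=sfreq, nperseg=nperseg)

# Recommended order: PSD of each epoch, then average the PSDs
_, psd_per_trial = welch(trials, fs=sfreq, nperseg=nperseg, axis=-1)
avg_of_psd = psd_per_trial.mean(axis=0)

i10 = np.argmin(np.abs(freqs - 10.0))          # index of the 10 Hz bin
print(f"10 Hz power, PSD of the average : {psd_of_avg[i10]:.4f}")
print(f"10 Hz power, average of the PSDs: {avg_of_psd[i10]:.4f}")
```

In this simulation the 10 Hz peak should mostly cancel out in the PSD of the average, while it is preserved when the PSD is estimated per epoch and averaged afterwards.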

You could do either a difference of averages: average(PSD(trialsA)) - average(PSD(trialsB))
Or a t-test between conditions: t-test(PSD(trialsA), PSD(trialsB))
The two should give you similar results: the t-test mainly adds a significance level to the difference of means.
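The two options above can be sketched as follows in Python, again on simulated values rather than real recordings. The arrays `psd_a` and `psd_b` are hypothetical per-trial PSDs (trials x frequencies), and the use of an independent two-sample t-test (`scipy.stats.ttest_ind`) is an assumption; the appropriate test depends on your design.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)
n_trials, n_freqs = 60, 50                     # assumed sizes, for illustration only

# Hypothetical per-trial PSD values (always positive), one row per trial
psd_a = rng.chisquare(4, size=(n_trials, n_freqs))
psd_b = rng.chisquare(4, size=(n_trials, n_freqs))
psd_a[:, 10:15] *= 3.0                         # simulated power increase in condition A

# Option 1: difference of averages, one value per frequency bin
diff = psd_a.mean(axis=0) - psd_b.mean(axis=0)

# Option 2: t-test between conditions, one t and one p value per frequency bin
t_stat, p_val = ttest_ind(psd_a, psd_b, axis=0)

# The t statistic has the same sign as the difference of means;
# the test adds a p-value on top of it.
print(np.all(np.sign(t_stat) == np.sign(diff)))
```

Note that one test is run per frequency bin here, so some correction for multiple comparisons would be needed before interpreting the p-values.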

When you use the PSD option “Group in frequency bands”, the process first computes the results at the full frequency resolution, and then averages them by frequency band.
This grouping is an additional processing step that can easily mask other underlying errors. I would recommend you do both (full spectrum AND frequency bands) for at least one subject, and once you are confident that both give equivalent, correct results, repeat it on the other subjects with just the “frequency bands” version.
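As a sketch of what this grouping step amounts to, and of how the check could be done by hand, here is a small Python example. The helper `band_average`, the band limits and the toy spectrum are all hypothetical; this is not the code used by the “Group in frequency bands” option.

```python
import numpy as np

def band_average(freqs, psd, bands):
    """Average a full-resolution PSD over each frequency band (last axis = frequency)."""
    out = {}
    for name, (fmin, fmax) in bands.items():
        mask = (freqs >= fmin) & (freqs <= fmax)   # bins inside the band, limits included
        out[name] = psd[..., mask].mean(axis=-1)
    return out

# Illustrative band limits in Hz (an assumption, adapt to your own definition)
bands = {"delta": (2, 4), "theta": (5, 7), "alpha": (8, 12),
         "beta": (15, 29), "gamma": (30, 90)}

freqs = np.arange(0, 100.5, 0.5)               # full frequency axis, 0.5 Hz resolution
psd = freqs.copy()                             # toy spectrum: power equal to frequency

band_psd = band_average(freqs, psd, bands)
for name, value in band_psd.items():
    print(name, value)
```

Comparing values recomputed this way from the full spectrum with the “frequency bands” output for one subject is one way to confirm that the two versions agree.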

Francois