After source reconstruction, I use the AAL atlas to extract time series from the motor cortex.
If you are using the entire precentral gyrus from AAL, you need to be aware that this region is very large and may include a lot of different sources. Averaging such large regions can be detrimental to the analysis. (I'm not saying you should not use it this way, just that it could be good to test this approach against smaller/more focal motor ROIs, at least on a few pilot subjects).
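To illustrate the concern about averaging, here is a toy sketch in plain Python. The signals are entirely synthetic and the two sub-populations are an assumption chosen to show an extreme case (perfect cancellation); real data would more likely show partial dilution.

```python
import math

n_samples = 200
t = [i / n_samples for i in range(n_samples)]

# Synthetic, hypothetical ROI content: 10 "focal" sources carry a 10 Hz
# oscillation, 10 other sources carry the same oscillation with opposite sign.
focal = [[math.sin(2 * math.pi * 10 * x) for x in t] for _ in range(10)]
other = [[-math.sin(2 * math.pi * 10 * x) for x in t] for _ in range(10)]

def roi_mean(sources):
    """Average the time series of all sources, sample by sample."""
    return [sum(col) / len(sources) for col in zip(*sources)]

def rms(signal):
    """Root mean square amplitude of a time series."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))

rms_focal = rms(roi_mean(focal))          # small ROI: focal sources only
rms_large = rms(roi_mean(focal + other))  # large ROI: both populations mixed
print(f"focal ROI RMS: {rms_focal:.3f}, large ROI RMS: {rms_large:.3g}")
```

The average over the focal sources keeps the oscillation, while the average over the mixed region loses it. This is the kind of effect worth checking on pilot subjects by comparing the full precentral gyrus against a smaller ROI.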
Since each subject's sources are in subject space after the inverse solution, which is different from template space (correct me if I am wrong), I was wondering how I can extract the volume for each subject. The solution, it seems to me, is to create a group grid so that I can compute the head model in line with the template.
This is one approach, suggested in the tutorials. Indeed, if you're using a grid of sources that matches point-to-point across all the subjects and the template, then you can define your ROI (= volume scout) at the group level, project it to the individual space of each subject, and then run the analysis in subject space. You could even directly average/compare the entire source maps across subjects. Note that with a group grid, you still need to recompute the forward model for each subject separately.
https://neuroimage.usc.edu/brainstorm/Tutorials/CoregisterSubjects#Volume_source_models
Later on, I saw that the tutorial, in the section Volume_atlases, says:
"Volumes in MNI space can be imported and transformed to the subject space."
Does that mean I can just import an atlas (in MNI space) and it will be warped to my subject space? That way, I would not need to create a group grid for my head model computation, right?
This is correct too. Once you have a non-linear MNI transformation computed for the T1 MRI of your subject (using SPM12 or CAT: https://neuroimage.usc.edu/brainstorm/CoordinateSystems#Non-linear_normalization), you can import any MNI-based parcellation into this subject:
https://neuroimage.usc.edu/brainstorm/Tutorials/DefaultAnatomy#MNI_parcellations
The two solutions should give very similar results, as they rely on the same mechanism: the MNI normalization available for the subject. In the first case, the source grid is transformed from MNI space to subject space; in the second case, an entire volume parcellation is transformed from MNI space to subject space using the same transformation.
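As a rough illustration of why both routes rely on the same transformation, here is a toy sketch in plain Python. Everything in it is an assumption for illustration: the per-axis scaling and shift stand in for the real (non-linear) MNI normalization, the spherical ROI and the grid are hypothetical, and none of this is Brainstorm code. In practice the second option would use a subject-specific grid, so the selected sources would differ slightly; the same points are reused here only to isolate the effect of the transformation.

```python
# Toy stand-in for the subject's MNI normalization (assumption: a simple
# per-axis scaling plus shift; the real transformation is non-linear).
SCALE = (0.9, 1.1, 1.05)
SHIFT = (2.0, -3.0, 5.0)

def mni_to_subject(p):
    return tuple(s * x + d for s, x, d in zip(SCALE, p, SHIFT))

def subject_to_mni(p):
    return tuple((x - d) / s for s, x, d in zip(SCALE, p, SHIFT))

def in_mni_roi(p, center=(-40.0, -20.0, 50.0), radius=15.0):
    """Hypothetical spherical ROI defined in MNI coordinates (mm)."""
    return sum((a - b) ** 2 for a, b in zip(p, center)) <= radius ** 2

# Template grid in MNI space, 10 mm spacing.
mni_grid = [(float(x), float(y), float(z))
            for x in range(-70, 71, 10)
            for y in range(-100, 71, 10)
            for z in range(-40, 81, 10)]

# Option 1 (group grid): label grid points in MNI space, then warp the grid.
# ROI membership is carried along by the point index.
roi_group_grid = {i for i, p in enumerate(mni_grid) if in_mni_roi(p)}
subject_grid = [mni_to_subject(p) for p in mni_grid]

# Option 2 (imported parcellation): warp the ROI itself, i.e. a subject-space
# point belongs to the ROI if its MNI counterpart does.
roi_warped_atlas = {i for i, p in enumerate(subject_grid)
                    if in_mni_roi(subject_to_mni(p))}

print(len(roi_group_grid), roi_group_grid == roi_warped_atlas)
```

With the same points and the same transformation, both routes select the same sources; any difference seen in real data would come from the grids or interpolation, not from the normalization itself.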
However, I think the first option (the group grid) provides more capabilities for additional analyses (comparing directly the entire source maps) with almost no extra work.
But maybe there is some extra complexity that I don't picture precisely at the moment?
The best way for you to gain confidence in your results is probably to compare the two approaches for a few pilot subjects and check that the differences are marginal. If they are not, there is an issue somewhere.
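One simple way to quantify such a comparison, sketched in plain Python, is to correlate the ROI time series obtained with each approach. The two series below are synthetic placeholders (in practice you would export the scout time series from each pipeline), and the variable names are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length time series."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

# Placeholder data: the second series is a slightly perturbed copy of the
# first, mimicking two pipelines that agree closely.
n = 500
ts_group_grid = [math.sin(2 * math.pi * 10 * i / n) for i in range(n)]
ts_mni_atlas = [0.95 * v + 0.02 * math.cos(2 * math.pi * 3 * i / n)
                for i, v in enumerate(ts_group_grid)]

r = pearson(ts_group_grid, ts_mni_atlas)
print(f"correlation between approaches: {r:.4f}")
```

What counts as "marginal" is a judgment call; a correlation close to 1 for each pilot subject would be reassuring, while a low value would suggest a problem in one of the two pipelines.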