We are trying to float particles around the performer using image tracking in the venue.
Is it possible to track an image that is placed about 16 feet (~5 m) away from the seat in which the user is seated?
I assume the printed image would need to be very large. How big would it have to be in order to be recognized?
Can the image still be recognized at that kind of distance?
Any reply would be appreciated.
I found Nreal image tracking to be quite sketchy at the moment. It's still not very consistent.
I think you'd need very large trackers for the initial recognition, and you'd also need to make sure the image is well lit and not obscured for it to be recognized reliably.
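As a rough sanity check on "how large", you can estimate the required print size from the viewing distance and the visual angle the image needs to subtend. The minimum angle is an assumption here (tracking SDKs don't publish a hard number, and it depends on lighting, camera resolution, and image contrast), so this sketch just sweeps a few plausible values:

```python
import math

def required_image_size(distance_m, min_angle_deg):
    """Side length (m) an image must have to span a given visual angle
    at a given distance: s = 2 * d * tan(theta / 2)."""
    return 2 * distance_m * math.tan(math.radians(min_angle_deg) / 2)

distance = 16 * 0.3048  # 16 feet in metres (~4.9 m)
# Assumed minimum visual angles for initial detection -- not SDK specs.
for angle in (5, 10, 15):
    print(f"{angle} deg -> {required_image_size(distance, angle):.2f} m")
```

Even at a fairly optimistic 5° the print comes out around 0.4 m across, and at 15° well over a metre, which matches the "massive trackers" intuition above.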
Does the position of the image/performer change during the performance? If not, it might be easier to attach images to the seats and place the particles relative to each seat's location.
Alternatively, manually position the particles on the stage during the performance (or use, say, a VIVE tracker), and use a multiplayer API to sync that position to the headsets.
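The "sync via a multiplayer API" idea boils down to: one operator-side process knows the performer's stage position (from a tracker or manual input) and broadcasts it, and each headset moves its particle anchor to the received position. A minimal sketch of that data flow, using plain UDP over loopback as a stand-in for whatever networking layer you actually use (the message format and both endpoints here are hypothetical, not NRSDK APIs):

```python
import json
import socket

def pack_position(x, y, z):
    """Serialize a stage position; a real multiplayer API would do this for you."""
    return json.dumps({"pos": [x, y, z]}).encode()

def unpack_position(data):
    """Decode a position message back into an (x, y, z) tuple."""
    return tuple(json.loads(data.decode())["pos"])

# Loopback demo: the 'operator' sends the tracker position,
# the 'headset' receives it and would set its particle anchor there.
recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv_sock.bind(("127.0.0.1", 0))           # OS picks a free port
port = recv_sock.getsockname()[1]

send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_sock.sendto(pack_position(1.5, 0.0, 8.2), ("127.0.0.1", port))

data, _ = recv_sock.recvfrom(1024)
print(unpack_position(data))               # headset-side particle anchor position
```

The nice property of this approach is that the headsets never have to see the image target at all; they only need a shared coordinate origin (e.g. one image anchor scanned once at the seat) plus the streamed stage position.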