Shared Transformer Encoder with Mask-Based 3D Model Estimation for Container Mass Estimation