Hi,
I’m wondering whether we can get any information about the size of the test dataset that will be used to evaluate our final submission of the pricing model. We have 16 million rows in total (which is huge), so roughly how many rows can we expect in the final test set? Will it be as large as the full dataset, a subset of it, or drawn from some other dataset? The answer could determine how we design our functions, or even which model we choose, since we want to avoid failures such as timeout errors.
Thanks a lot!