Test Set for final evaluation of Complex Pricing Model

Hi,


I’m wondering whether we can get any information about the size of the test dataset that will be used to evaluate our final submission of the pricing model. We have 16 million rows in total, which is very large, so roughly how many rows should we expect in the final test set? Will it be as large as the full dataset, a subset of it, or drawn from a different dataset? The answer could affect how we design our functions, or even the model, to avoid failures such as timeout errors.


Thanks a lot!

I am not sure I understand why you need this information. You should not make any assumptions about the test set. Your functions should be as general as possible and should work with input in the same format as the data you were given.
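
For illustration only, one way to keep a prediction function independent of the number of rows is to process the input in fixed-size chunks. This is a minimal sketch, not the competition's API: `iter_chunks`, `predict_in_chunks`, the `predict_batch` callable and the default chunk size are all hypothetical names and values chosen for the example.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def iter_chunks(rows: Iterable, chunk_size: int) -> Iterator[List]:
    """Yield successive lists of at most chunk_size rows."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    it = iter(rows)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def predict_in_chunks(
    predict_batch: Callable[[List], Iterable[float]],  # hypothetical model call
    rows: Iterable,
    chunk_size: int = 100_000,  # assumed value; tune to available memory
) -> List[float]:
    """Return one prediction per input row, whatever the input size."""
    predictions: List[float] = []
    for chunk in iter_chunks(rows, chunk_size):
        batch = list(predict_batch(chunk))
        # Guard against a model that drops or duplicates rows.
        if len(batch) != len(chunk):
            raise ValueError("predict_batch must return one value per row")
        predictions.extend(batch)
    return predictions
```

Written this way, the function makes no assumption about whether it receives a few rows or millions; only the memory used per chunk is bounded.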


Thanks.