
Defensive programming

Where to validate data?

This approach usually relies on a set of assertions and data checks. However, there are no clear rules about when an assertion or check has to be done. According to some sources, it should be done as soon as data enters the system. The reasoning is that this isolates the system from processing incorrect data and reports a problem as early as possible. I have a different point of view: data should be validated only at the point where it is actually processed. This gives clear benefits. Let's consider the following code:

data = getInput();
checkvalidity(data);
pass_to_process1(data);

process1:
   checkvalidity(data);
   do_some_stuff();
   pass_to_process2(data);

process2:
   checkvalidity(data);
   do_some_stuff();
   pass_to_finalprocess(data);

finalprocess:
   checkvalidity(data);
   process(data);

 

Notice that every intermediate step can validate the data, but is that good? It may seem so, because it avoids extra work on bad data. However, most of the time the system deals with good data, and bad data shows up in less than 1% of cases, so validating at every step adds significant overhead. Another problem is that the final processing step can change its validation criteria. Say it could once process only ASCII strings, or only strings shorter than 1024 characters. Now the method has been improved and those limitations have been lifted, yet all the upper-level code is still checking for the old limitations.
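The stale-check problem can be made concrete with a small Python sketch. Everything here is hypothetical and only for illustration: process_text stands in for the improved final processor, upper_layer for the upper-level code that still carries the old ASCII and 1024-character limits.

```python
OLD_MAX_LEN = 1024  # hypothetical limit the final processor used to impose


def process_text(text: str) -> int:
    """Improved final processor: the ASCII and length limits are gone."""
    if not isinstance(text, str):
        raise ValueError("expected a string")
    return len(text)


def upper_layer(text: str) -> int:
    """Upper-level code that still duplicates the old restrictions."""
    if not text.isascii() or len(text) >= OLD_MAX_LEN:
        raise ValueError("rejected by a stale upper-level check")
    return process_text(text)


long_input = "x" * 2000

# The final processor accepts the long string without complaint...
print(process_text(long_input))

# ...but the duplicated check above it still refuses the same input.
try:
    upper_layer(long_input)
except ValueError as err:
    print("upper layer:", err)
```

The final processor was fixed in one place, but callers do not benefit until every copy of the old check is found and removed.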

Conclusion: validate data only where it is processed, and use the exception mechanism to notify the top-level requester.
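A minimal Python sketch of that structure might look like this. The names mirror the pseudocode above, but InvalidDataError, handle_request and the "non-empty string" rule are assumptions made up for the example, not part of any real system.

```python
class InvalidDataError(ValueError):
    """Raised by the code that actually consumes the data (hypothetical)."""


def final_process(data: str) -> str:
    # The only validation in the pipeline, done where the data is used.
    if not data:
        raise InvalidDataError("empty input")
    return data.upper()


def process2(data: str) -> str:
    # Intermediate step: passes the data along without re-checking it.
    return final_process(data)


def process1(data: str) -> str:
    # Intermediate step: no validation here either.
    return process2(data)


def handle_request(data: str) -> str:
    # Top-level requester: the single place where the failure is reported.
    try:
        return process1(data)
    except InvalidDataError as err:
        return f"error: {err}"


print(handle_request("hello"))
print(handle_request(""))
```

The validation rule lives in one function, so changing it later means touching only final_process; the exception carries the failure through process2 and process1 up to the requester without any extra code in between.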

 



