Try not to pick sports where there are a lot of "unknown" international matches, historically.
yeah, i'm always afraid of this. most sports are now sufficiently documented as far as main tournaments go. however, friendlies are a completely different matter. even in basketball, friendlies are not very well documented. football, rugby, and cricket are the only sports i've seen that document friendlies well. i just started handball yesterday and the only thing i'm having trouble finding is early results for the African championship. in most sports, tournaments that are not considered main tournaments can also present a problem for research. without www.theroonba.com, www.todor66.com and www.wikipedia.org i would be screwed. my list of essential sources is getting pretty extensive: https://docs.zoho.com/writer/open/1vl7i972fbd149897431685118db97a9a5f40
Friendlies are not really necessary - and in most cases can complicate things, as teams use squads of variable strength. In international tournaments, teams will more often than not use their best available players.
I include friendly matches in sports where there are lots (football, futsal and beach soccer, ice hockey, rugby union and cricket) and where they are relatively important. There are also a fair number in basketball, handball, field hockey, volleyball and water polo - but results of these are poorly documented. And where they are documented (archives on live score sites), their inclusion is arbitrary - probably fewer than 50% of all friendly matches make it onto these sites. For these sports, I concentrated only on world tournaments, continental tournaments, regional tournaments and tournaments that are part of multi-sport events (again world, continental and regional). I also went back only to the year 2000 for most of these sports. I find this is sufficient to produce meaningful comparisons (and also, I am fairly confident I have collected between 95% and 100% of all the results from these tournaments, whereas if I'd tried to go back before 2000, that number would have decreased significantly). The reason I've only gone back to 2010 on the website is lack of time to piss about formatting all the results and putting in all the venues, dates, etc. It took me long enough just to go back to 2010...
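For anyone keeping results in a script rather than a spreadsheet, the selection rule described above could be sketched roughly like this. It is only an illustration under assumptions: the `Match` record, its field names and the `keep` helper are made up for the example, and it treats the year 2000 cutoff as applying only to the sports with poorly documented friendlies.

```python
from dataclasses import dataclass

# Sports where friendlies are numerous and relatively important (as listed above).
FRIENDLY_SPORTS = {
    "football", "futsal", "beach soccer",
    "ice hockey", "rugby union", "cricket",
}
# Tournament levels kept for the other sports (multi-sport events included).
TOURNAMENT_LEVELS = {"world", "continental", "regional"}
CUTOFF_YEAR = 2000  # assumption: applied only to the poorly documented sports


@dataclass(frozen=True)
class Match:
    # Hypothetical record layout, not the actual spreadsheet columns.
    sport: str
    year: int
    level: str  # "world", "continental", "regional" or "friendly"


def keep(match: Match) -> bool:
    """Return True if the match would be included under the rule above."""
    if match.sport in FRIENDLY_SPORTS:
        # Friendlies are well documented here, so everything is kept.
        return True
    # Otherwise keep only tournament matches from the cutoff year onwards.
    return match.level in TOURNAMENT_LEVELS and match.year >= CUTOFF_YEAR


matches = [
    Match("football", 1950, "friendly"),
    Match("handball", 2005, "friendly"),
    Match("handball", 1998, "world"),
    Match("handball", 2004, "continental"),
]
kept = [m for m in matches if keep(m)]
```

With the sample list, only the football friendly and the 2004 handball continental match survive the filter.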
My spreadsheets of results were started just to have an ordered list of results for ranking purposes, and therefore I didn't include dates, as I didn't need them.