I mean, I know one way is using a crawler that loads a number of known pages and attempts to follow all the links they list, or at least the ones that lead to different domains, which is how I believe most engines started off.
But how would you find your way out of “bubbles”? Let’s say that, after following all the links from the sites you started with, none point to abc.xyz. How could you discover that site otherwise?
Your search engine would have to be told about that site some other way.
I’m not sure if you can anymore, but at least years ago you could register your site with Google so that it could find it even without any other sites linking to yours.
Yeah, it’s still possible to register your site with Google.
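To make the “bubble” problem concrete, here is a rough sketch of a link-following crawl over a tiny in-memory link graph. Nothing here touches the network: the site names, the `LINKS` table, and the `crawl` function are all made up for illustration, and a real crawler would fetch and parse pages instead. It only shows that a site nobody links to stays invisible until it is handed to the crawler directly, which is roughly what registering a site amounts to.

```python
from collections import deque

# Hypothetical link graph: each site maps to the sites it links to.
LINKS = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "abc.xyz": ["a.example"],  # links out, but no other site links to it
}


def crawl(seeds, links):
    """Breadth-first crawl: return every site reachable from the seeds."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        site = queue.popleft()
        for target in links.get(site, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Starting from a known site, the crawl never leaves the bubble.
print("abc.xyz" in crawl(["a.example"], LINKS))  # False

# Once the site is submitted as an extra seed, it gets discovered.
print("abc.xyz" in crawl(["a.example", "abc.xyz"], LINKS))  # True
```

The only way abc.xyz shows up is by being added to the seed list, since reachability through links alone can never get there.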