The unit used to manage memory is a page. Only a finite number of slots are available for active memory pages, and the computing device usually does not know in advance which pages should fill them. Demand paging works through page faults: a page fault occurs when the CPU requests a page that is not in one of the active memory slots. The requested page is then fetched from storage and remains in one of the active slots.
As this process continues through successive page faults, all the slots eventually fill up. The approach works extremely well if the pages occupying the slots are the ones required most frequently. The rate of page faults then decreases, because most requested pages are found within active memory itself, which reduces retrieval time considerably. If a page fault occurs once every slot is full, one of the resident pages is removed and the new page is inserted in its place.
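The behaviour described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the text does not say which resident page is removed when the slots are full, so the sketch assumes first-in, first-out (FIFO) replacement, and the name `simulate_demand_paging` and the sample reference strings are hypothetical.

```python
from collections import deque


def simulate_demand_paging(reference_string, num_slots):
    """Count page faults for a sequence of page requests.

    Assumption: the oldest loaded page is evicted when all slots are full
    (FIFO). The source text does not name a replacement policy.
    """
    if num_slots <= 0:
        raise ValueError("num_slots must be positive")

    resident = set()      # pages currently held in active memory slots
    load_order = deque()  # order in which resident pages were loaded
    faults = 0

    for page in reference_string:
        if page in resident:
            continue  # hit: the page is already in an active slot

        # Page fault: the page must be fetched from storage.
        faults += 1
        if len(resident) >= num_slots:
            # All slots are full, so remove one resident page first.
            victim = load_order.popleft()
            resident.remove(victim)
        resident.add(page)
        load_order.append(page)

    return faults, list(load_order)


if __name__ == "__main__":
    # The working set {1, 2, 3} fits in three slots: only the first
    # request for each page faults.
    print(simulate_demand_paging([1, 2, 3, 1, 2, 3, 1, 2], num_slots=3))
    # A fourth page forces evictions once the slots are full.
    print(simulate_demand_paging([1, 2, 3, 4, 1], num_slots=3))
```

In the first call the frequently used pages stay resident, so later requests are served from active memory without further faults. In the second call every request faults, because each new page displaces one that is needed again shortly afterwards.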