Would it help for somebody who is good at this sort of thing to chip in with how it works in my head? (Well, sort of, it is hard to put into words because it's all images in my head).
I pay attention to what is around me - I notice where the sun is, trees, distinctive features in houses/gardens, where the exits are, if there's a particular plant or something in a building by a pillar, so I know where I am at that moment and what appears to be nearby. When I move away from that position, I note further things in relation to where I had been, as though I've got a map overlay in my head.
When I am going to a place from one direction, I have a memory of all the other familiar places in that overlay, and then I join the pieces up when coming from a different direction, like filling in the blanks. It all gradually expands until I know either everything or the important places and a few 'checks' on the way.
I wouldn't use a big Sainsbury's as a marker by itself, as there are an awful lot of the things around and they all look very similar - but a big Sainsbury's with a bus stop by the entrance, a Boots opposite and a zebra crossing slightly further on by a big, green building would be a marker point.
To add to that, I've also got an overlay in my head of bus routes and stops/train and tube maps/etc. So I might not know immediately where X coffee shop is, but I know it's in x location, which is on the 123 bus route where it turns right to go past Sainsbury's and there is a blue doctor's surgery coming up to the junction. The 321 bus turns left past the two chemists where the majority of the shops are, and there's a good chance the coffee shop is on the little stretch that's set back from there. Using the mental overlays, I can therefore get to where the coffee shop is likely to be from all 4 directions by bus or by car.
Google Street View helps - even if the place wasn't actually there at the time of the photos - because it gives a general shape to the road and buildings.
I can also do this long distance, by public transport and in the countryside - finding useful points, being able to visualise them and then place myself in the map overlay.
When I need to explain or learn a route, I use body movements. So if I were describing how to get from my house to the bus stop, I'd be using my hands to indicate forward x distance, left, left, across the road, left (two big magnolia trees and the wall round the corner painted brown), straight(ish), left, cross, straight on, guitar shop opposite, Greek shop opposite, hairdresser on left, block of flats at junction as you look straight ahead and blue pub to the right of that, bus stop.
DP takes the piss out of me. Because homing pigeons and other migratory birds have large quantities of magnetite (a mineral) in their beak area, which is believed to be sensitive to the Earth's magnetic field, and because humans have lesser amounts of it in their bodies, he says 'Come on, Miss Magnetite, get us home'. He could get lost in the kitchen with a dedicated tour guide at times, so he finds this amusing and very useful - particularly when he called once at 5am on a Sunday morning to say his lift had broken down and somebody else had dropped him in x place 10 miles away, and where did he go from there? I told him 'Are you somewhere safe? Stay there, I'm working on it', logged into the transport planner, and used a combination of that, Google Street View and memory to direct him to the nearest bus stop: which bus to catch, how long it would be, things he would see on the route and where it would drop him off near us.
Sorry for the warble. Tl;dr: it's a combination of visual and physical memory, and then names of streets/locations are tagged on top.
Everything is interrelated and new places are added on like extra sheets as if I were standing on a map.