To do any 3D scaling like that, you need to take the z position into account.
When z=1 the object is at the screen, and when z=0 it's at your eye. Usually you want to destroy or hide the object before it reaches the eye, since the position and size are divided by z and blow up as z approaches 0.
Consider this example:
https://dl.dropboxusercontent.com/u/542 ... edo3d.capx
I just drew up some simple graphics. I placed the road so it disappears at a vanishing point (320,240).
I then placed the "building" sprites on either side of the road, where I'd want them to be if they were at the screen (z=1). I then resized them so their edges match the edge of the road.
Looking at just the right "building", it has a position of (556,426) and a size of (161,105).
Next is the math part. We want it to move and scale as the z changes.
Set position to ((556-320)/z+320 ,(426-240)/z+240)
Set size to (161/z, 105/z)
In the above equations, 320 and 240 are the vanishing point,
556 and 426 are the object's original position,
and 161 and 105 are the original size.
z is how far into the screen you want the object to be.
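If it helps to see the same math outside of events, here is a rough Python sketch of the two "Set" actions above. The function name project and its parameters are just names I made up for illustration; the vanishing point defaults to (320,240) as in the example.

```python
def project(x0, y0, w0, h0, z, vp=(320, 240)):
    """Return ((x, y), (w, h)) for an object whose position and size at the
    screen (z=1) are (x0, y0) and (w0, h0).

    Hypothetical helper: mirrors the "Set position" / "Set size" formulas.
    """
    if z <= 0:
        # z=0 is the eye; the object should be destroyed or hidden before this.
        raise ValueError("z must be greater than 0")
    vx, vy = vp
    # Scale the offset from the vanishing point by 1/z, then shift back.
    x = (x0 - vx) / z + vx
    y = (y0 - vy) / z + vy
    # Size scales by the same 1/z factor.
    return (x, y), (w0 / z, h0 / z)


# The right "building" from the example, at the screen and twice as far away.
at_screen = project(556, 426, 161, 105, 1)
farther = project(556, 426, 161, 105, 2)
```

At z=1 you get the original position and size back, and as z grows the object slides toward the vanishing point and shrinks.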
Z sorting can be done with something simple like this (the sprite with the largest z is moved last, so it ends up at the very back):
for each sprite ordered by sprite.z ascending
--- sprite: move to back
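To show why looping in ascending z order works, here is a small Python sketch that imitates the loop. It assumes a layer is just a list drawn from index 0 (back) to the end (front), and that each sprite is a dict with a "z" key; both are assumptions for illustration, not how the engine stores things.

```python
def z_sort(layer):
    """Return the sprites in draw order: index 0 is drawn first (the back)."""
    ordered = []
    # Ascending z: the nearest sprite is visited first.
    for sprite in sorted(layer, key=lambda s: s["z"]):
        # "move to back": put this sprite behind everything placed so far.
        ordered.insert(0, sprite)
    return ordered


layer = [
    {"name": "a", "z": 2.0},
    {"name": "b", "z": 0.5},
    {"name": "c", "z": 1.0},
]
draw_order = [s["name"] for s in z_sort(layer)]
```

Each sprite goes behind the ones handled before it, so the farthest one ends up at the back and the nearest one stays in front.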
Collision detection can be done by also checking that the other object's z is within a small range of the sprite's z:
sprite: on collision with building
building: z > sprite.z-0.01
building: z < sprite.z+0.01
--- destroy sprite
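The two z conditions boil down to one absolute-difference check. A minimal Python sketch, with z_in_range as a made-up name and the 0.01 tolerance taken from the events above:

```python
def z_in_range(sprite_z, building_z, tolerance=0.01):
    """True when building_z lies strictly between sprite_z - tolerance
    and sprite_z + tolerance (same test as the two events above)."""
    return abs(building_z - sprite_z) < tolerance


# Only treat a 2D overlap as a real hit when the depths also match.
close_enough = z_in_range(1.0, 1.005)
too_far = z_in_range(1.0, 1.02)
```

The 2D "on collision" trigger still does the on-screen overlap test; this just filters out overlaps between objects at different depths. You may want a bigger tolerance if objects move fast in z.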