To enable humans to instruct robots in natural language to perform spatial tasks, and following the "human-centered" design principle, the present study analyzed the cognitive mechanisms and processes of human-human spatial natural language interaction and then proposed a human-intelligence-isomorphic scheme of robot spatial cognition. The robot is expected to be as intelligent as a human: able to take the spatial perspective of others, visually recognize spatial frames of reference, understand and produce natural language in any kind of spatial frame of reference, and employ theory of mind to judge others’ intentions and adjust its interaction strategy according to their feedback.