In order to directly represent documents, Le and Mikolov [2014] introduced Doc2Vec, where every paragraph has a unique vector in a matrix D and every word has its own vector in a matrix W (the same local-context architecture as Word2Vec). These vectors are averaged or concatenated to predict the next word in a given context within the paragraph.
~
ELSAFTY, Ahmed, 2017. Document Similarity using Dense Vector Representation, p. 16.
The paragraph vector is shared only among words of the same paragraph. It can be viewed as an extra word in the context that stays fixed across all sentences and windows of the paragraph; hence it preserves (memorizes) the topic of the paragraph. This is where the architecture gets its name, "Distributed Memory", shown in figure 2.5.
Figure 2.5: The Doc2Vec Distributed Memory model uses the paragraph vector along with the local context words to predict the word w(t); the paragraph vector also acts as a memory of the paragraph's topic [Le and Mikolov 2014]
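The mechanism described above can be sketched as a toy training loop: the paragraph vector in D is averaged with the context-word vectors from W, and the result feeds a softmax that predicts the next word. This is a minimal illustration with assumed hyperparameters (vocabulary, dimensionality, learning rate, window size), not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat"]
word2id = {w: i for i, w in enumerate(vocab)}
paragraph = ["the", "cat", "sat", "on", "the", "mat"]

dim, lr, win = 8, 0.1, 2
W = rng.normal(0, 0.1, (len(vocab), dim))   # word vectors (matrix W)
D = rng.normal(0, 0.1, (1, dim))            # one paragraph vector (matrix D)
U = rng.normal(0, 0.1, (len(vocab), dim))   # softmax output weights

for epoch in range(500):
    for t in range(win, len(paragraph)):
        ctx = [word2id[w] for w in paragraph[t - win:t]]
        tgt = word2id[paragraph[t]]
        # hidden layer: average of the paragraph vector and context vectors
        h = (D[0] + W[ctx].sum(axis=0)) / (1 + len(ctx))
        scores = U @ h
        p = np.exp(scores - scores.max())
        p /= p.sum()
        g = p.copy()
        g[tgt] -= 1.0                       # softmax cross-entropy gradient
        grad_h = U.T @ g
        U -= lr * np.outer(g, h)
        # the same averaged gradient flows back to D and the context words in W
        D[0] -= lr * grad_h / (1 + len(ctx))
        W[ctx] -= lr * grad_h / (1 + len(ctx))

# after training, query the context "on the" within this paragraph
h = (D[0] + W[[word2id["on"], word2id["the"]]].sum(axis=0)) / 3
print(vocab[int(np.argmax(U @ h))])
```

At inference time for an unseen document, W and U would be frozen and only a fresh row of D trained, which is how Doc2Vec produces a vector for a new paragraph.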