This chapter introduces the framework used in this book for exploring emoji-text relations in social media. It begins by explaining the discourse semantic systems developed in Systemic Functional Linguistics for describing ideational, interpersonal, and textual meaning, laying the foundation for exploring, in subsequent chapters, the linguistic meanings with which emoji coordinate. The chapter then introduces the concept of ‘intermodal convergence’, used in social semiotics to describe how semiotic modes such as language and images coordinate to make meaning. It outlines the principles that we use for determining emoji-text convergence, including proximity, minimum mapping, and prosodic correspondence. It concludes with an overview of the system of emoji-text convergence, presenting the system network that guides the close textual analysis of the social media corpora used in the book.