SuperGLUE is a new benchmark styled after the original GLUE benchmark, with a set of more difficult language understanding tasks, improved resources, and a new public leaderboard.
Recent studies have shown that vision transformer (ViT) models can attain better results than most state-of-the-art convolutional neural networks (CNNs) across various image recognition tasks, and can do so while using considerably fewer computational resources. This has led some researchers to propose that ViTs could replace CNNs in this field. However, despite their promising performance, ViTs are…
In recent years, there has been an explosion of methods based on self-attention, and in particular Transformers, first in the field of Natural Language Processing and recently also in the field of…
Engineer friends often ask me: Graph Deep Learning sounds great, but are there any big commercial success stories? Is it being deployed in practical applications? Besides the obvious ones (recommendation systems at Pinterest, Alibaba and Twitter), a slightly nuanced success story is the Transformer architecture, which has taken the NLP industry by storm. Through this post, I want to establish links between Graph Neural Networks (GNNs) and Transformers. I'll talk about the intuitions behind model architectures in the NLP and GNN communities, make connections using equations and figures, and discuss how we could work together to drive progress.
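To make the connection the post draws concrete, here is a minimal NumPy sketch (the function names, toy shapes, and the ring-graph example are mine, not from the post): single-head scaled dot-product self-attention can be read as message passing on a fully connected token graph, and masking the score matrix with a sparse adjacency matrix recovers GNN-style neighbourhood aggregation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_as_message_passing(H, Wq, Wk, Wv, adj=None):
    """One self-attention layer viewed as message passing.

    H:   (n, d) node/token features.
    adj: optional (n, n) 0/1 adjacency mask. An all-ones mask (or None)
         is the Transformer case: every token attends to every token.
         A sparse mask restricts aggregation to graph neighbours, as in a GNN.
    """
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    scores = Q @ K.T / np.sqrt(Wk.shape[1])          # pairwise "edge" scores
    if adj is not None:
        scores = np.where(adj > 0, scores, -np.inf)  # drop non-edges
    weights = softmax(scores, axis=-1)               # attention = soft adjacency
    return weights @ V                               # aggregate neighbour messages

# Toy usage: 4 tokens with feature size 8.
rng = np.random.default_rng(0)
H = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))

full = attention_as_message_passing(H, Wq, Wk, Wv)   # Transformer: complete graph
ring = np.eye(4) + np.roll(np.eye(4), 1, 0) + np.roll(np.eye(4), -1, 0)
local = attention_as_message_passing(H, Wq, Wk, Wv, adj=ring)  # GNN-like: ring graph
```

The only difference between the two calls is the adjacency mask, which is the crux of the post's argument: a Transformer is attention-weighted aggregation over a complete graph, while a GNN aggregates over a fixed sparse one.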
A. Mahapatra, S. Nangi, A. Garimella, and A. N. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 942–951. Abu Dhabi, United Arab Emirates, Association for Computational Linguistics, (December 2022)
G. Sant, G. Gállego, B. Alastruey, and M. Costa-jussà. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Student Research Workshop, pages 277–284. Hybrid: Seattle, Washington + Online, Association for Computational Linguistics, (July 2022)
A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, and 2 other author(s). (2020). arXiv:2010.11929. Comment: Fine-tuning code and pre-trained models are available at https://github.com/google-research/vision_transformer. ICLR camera-ready version with 2 small modifications: 1) added a discussion of the CLS vs. GAP classifier in the appendix; 2) fixed an error in the exaFLOPs computation in Figure 5 and Table 6 (relative performance of models is basically not affected).
S. Toshniwal, P. Xia, S. Wiseman, K. Livescu, and K. Gimpel. Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference, pages 111–120. Punta Cana, Dominican Republic, Association for Computational Linguistics, (November 2021)
Z. Dai, Z. Yang, Y. Yang, J. Carbonell, Q. Le, and R. Salakhutdinov. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2978–2988. Florence, Italy, Association for Computational Linguistics, (July 2019)