SuperGLUE is a new benchmark styled after the original GLUE benchmark, with a set of more difficult language understanding tasks, improved resources, and a new public leaderboard.
Recent studies have shown that vision transformer (ViT) models can attain better results than most state-of-the-art convolutional neural networks (CNNs) across various image recognition tasks, and can do so while using considerably fewer computational resources. This has led some researchers to propose that ViTs could replace CNNs in this field. However, despite their promising performance, ViTs are…
In recent years, there has been an explosion of methods based on self-attention, and in particular Transformers, first in the field of Natural Language Processing and recently also in the field of…
Engineer friends often ask me: Graph Deep Learning sounds great, but are there any big commercial success stories? Is it being deployed in practical applications? Besides the obvious ones (recommendation systems at Pinterest, Alibaba and Twitter), a slightly nuanced success story is the Transformer architecture, which has taken the NLP industry by storm. Through this post, I want to establish links between Graph Neural Networks (GNNs) and Transformers. I'll talk about the intuitions behind model architectures in the NLP and GNN communities, make connections using equations and figures, and discuss how we could work together to drive progress.
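To make the connection the post draws concrete, here is a minimal NumPy sketch (the function names, toy shapes, and the ring-graph example are mine, not from the post): single-head scaled dot-product self-attention can be read as message passing on a fully connected token graph, and masking the score matrix with a sparse adjacency matrix recovers GNN-style neighbourhood aggregation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_as_message_passing(H, Wq, Wk, Wv, adj=None):
    """One self-attention layer viewed as message passing.

    H:   (n, d) node/token features.
    adj: optional (n, n) 0/1 adjacency mask. An all-ones mask (or None)
         is the Transformer case: every token attends to every token.
         A sparse mask restricts aggregation to graph neighbours, as in a GNN.
    """
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    scores = Q @ K.T / np.sqrt(Wk.shape[1])          # pairwise "edge" scores
    if adj is not None:
        scores = np.where(adj > 0, scores, -np.inf)  # drop non-edges
    weights = softmax(scores, axis=-1)               # attention = soft adjacency
    return weights @ V                               # aggregate neighbour messages

# Toy usage: 4 tokens with feature size 8.
rng = np.random.default_rng(0)
H = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))

full = attention_as_message_passing(H, Wq, Wk, Wv)   # Transformer: complete graph
ring = np.eye(4) + np.roll(np.eye(4), 1, 0) + np.roll(np.eye(4), -1, 0)
local = attention_as_message_passing(H, Wq, Wk, Wv, adj=ring)  # GNN-like: ring graph
```

The only difference between the two calls is the adjacency mask, which is the crux of the post's argument: a Transformer is attention-weighted aggregation over a complete graph, while a GNN aggregates over a fixed sparse one.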
A. Mahapatra, S. Nangi, A. Garimella, and A. N. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 942–951. Abu Dhabi, United Arab Emirates, Association for Computational Linguistics, (December 2022)
G. Sant, G. Gállego, B. Alastruey, and M. Costa-jussà. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Student Research Workshop, pages 277–284. Hybrid: Seattle, Washington + Online, Association for Computational Linguistics, (July 2022)
A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, and 2 other author(s). (2020). arXiv:2010.11929. Comment: Fine-tuning code and pre-trained models are available at https://github.com/google-research/vision_transformer. ICLR camera-ready version with 2 small modifications: 1) added a discussion of the CLS vs. GAP classifier in the appendix; 2) fixed an error in the exaFLOPs computation in Figure 5 and Table 6 (relative performance of models is basically not affected).
S. Toshniwal, P. Xia, S. Wiseman, K. Livescu, and K. Gimpel. Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference, pages 111–120. Punta Cana, Dominican Republic, Association for Computational Linguistics, (November 2021)
Z. Dai, Z. Yang, Y. Yang, J. Carbonell, Q. Le, and R. Salakhutdinov. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2978–2988. Florence, Italy, Association for Computational Linguistics, (July 2019)